Estimation of P ( Y < X ) in a Four-Parameter Generalized Gamma Distribution

Abstract: In this paper we consider estimation of R = P(Y < X), when X and Y are distributed as two independent four-parameter generalized gamma random variables with the same location and scale parameters. A modified maximum likelihood method and a Bayesian technique have been used to estimate R on the basis of independent samples. As the Bayes estimator cannot be obtained in closed form, it has been implemented using an importance sampling procedure. A simulation study has also been carried out to compare the two methods.


Introduction
Stress-strength reliability is one of the main tools of reliability analysis of structures. A stress-strength system fails as soon as the applied stress Y is at least as large as its strength X. This model is also known as the load-capacity model in the context of solid mechanics or structural engineering. Inference regarding P(Y < X), which defines the reliability of the system, has been widely discussed in the literature when X and Y are assumed to be independent random variables. See, for example, Basu (1964), Downton (1973), Tong (1974, 1977), Kelley, Kelley, and Suchany (1976), Beg (1980), Iwase (1987), McCool (1991), Ivshin (1996), Ali, Woo, and Pal (2004), Ali, Pal, and Woo (2005, 2010), Ali and Woo (2005a, 2005b), Pal, Ali, and Woo (2005), Raqab and Kundu (2005), and Raqab, Madi, and Kundu (2008). Besides system reliability, P(Y < X) finds importance in other fields too. For example, in biometry, suppose X represents a patient's remaining years of life when treated with drug A and Y represents the same when treated with drug B. Then, if the choice of drug is left to the patient, his deliberation will center on whether P(Y < X) is less than or greater than 1/2. In the context of statistical tolerance, if X denotes the diameter of a shaft and Y the diameter of a bearing that is to be mounted on the shaft, then the probability that the bearing fits without any interference is given by P(Y < X). Hence, it is very important to consider inference on P(Y < X).
In this paper, we consider the problem of estimating R = P(Y < X), where X and Y are distributed independently as generalized gamma random variables. A four-parameter generalized gamma distribution may be defined by the cumulative distribution function (cdf)

\[
F(x;\alpha,\beta,\gamma,\theta) = \left[\frac{1}{\Gamma(\alpha)}\int_0^{(x-\theta)/\beta} t^{\alpha-1}e^{-t}\,dt\right]^{\gamma}, \quad x > \theta,
\]

and the probability density function (pdf)

\[
f(x;\alpha,\beta,\gamma,\theta) = \frac{\gamma}{\beta\,\Gamma(\alpha)}\left[\frac{1}{\Gamma(\alpha)}\int_0^{(x-\theta)/\beta} t^{\alpha-1}e^{-t}\,dt\right]^{\gamma-1}\left(\frac{x-\theta}{\beta}\right)^{\alpha-1}e^{-(x-\theta)/\beta}, \quad x > \theta.
\]

Here θ and β are the location and scale parameters, respectively, and (α, γ) are the shape parameters. We shall denote the distribution by GG(α, β, γ, θ). For γ = 1 and θ = 0 this distribution reduces to the standard two-parameter gamma distribution, whereas for α = 1 it reduces to the three-parameter generalized exponential distribution studied by Gupta and Kundu (1999).
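A numerical sketch of the GG(α, β, γ, θ) distribution functions may help. It assumes the exponentiated-gamma form F(x) = [P(α, (x − θ)/β)]^γ for x > θ, with P the regularized lower incomplete gamma function (an assumption consistent with the special cases quoted above); the helper names gg_cdf and gg_pdf are illustrative, not from the paper.

```python
import numpy as np
from scipy.special import gammainc
from scipy.stats import gamma as gamma_dist

def gg_cdf(x, alpha, beta, gam, theta):
    """CDF of the assumed GG(alpha, beta, gam, theta); zero for x <= theta."""
    z = np.maximum((np.asarray(x, dtype=float) - theta) / beta, 0.0)
    return gammainc(alpha, z) ** gam        # regularized incomplete gamma, raised to gam

def gg_pdf(x, alpha, beta, gam, theta):
    """PDF obtained by differentiating the CDF above (valid for x > theta)."""
    z = np.maximum((np.asarray(x, dtype=float) - theta) / beta, 0.0)
    base = gamma_dist.pdf(z, alpha) / beta  # two-parameter gamma density of (x - theta)/beta
    return np.where(z > 0, gam * gammainc(alpha, z) ** (gam - 1.0) * base, 0.0)
```

With γ = 1 and θ = 0 the functions reduce to the ordinary gamma distribution, and with α = 1 the cdf becomes (1 − e^{−(x−θ)/β})^γ, the generalized exponential form.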
In this paper, we assume that X ∼ GG(α, β, γ1, θ) and Y ∼ GG(α, β, γ2, θ). It has been observed that the usual maximum likelihood estimator of the parameters may not exist. In Section 2, we study modified maximum likelihood estimators of the unknown parameters and hence of R. In Section 3, importance sampling is used to obtain Bayes estimates of the model parameters and of R. In Section 4, the procedures are illustrated by analyzing a simulated and a real data set. Finally, in Section 5 some simulation studies are provided and in Section 6 a discussion of our findings is given.

Modified Maximum Likelihood Estimation
Since the likelihood function is maximized at θ = w, the modified MLE of θ is θ̂ = w. The modified likelihood function of ρ* = (α, β, γ1, γ2) is then defined as follows.
Case 1: y(1) < x(1): The modified MLEs of γ1 and γ2 are obtained by solving the corresponding likelihood equations, in which α̂ and β̂ are the modified MLEs of α and β, respectively. These satisfy the non-linear equations obtained by maximizing the modified profile likelihood function of (α, β).
Case 2: x(1) < y(1): Here the modified likelihood function has an analogous form, and the modified MLE of ρ* can be obtained in the same way as in Case 1, with α̂ and β̂ again obtained by maximizing the corresponding modified profile likelihood function of (α, β).
The expression for R can easily be shown to be R = γ1/(γ1 + γ2). Hence, the modified MLE of R is given by R̂ = γ̂1/(γ̂1 + γ̂2).
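The identity R = γ1/(γ1 + γ2) can be verified by simulation under the assumed exponentiated-gamma form of GG(·) (an assumption; the sampler gg_sample and the parameter values below are illustrative):

```python
import numpy as np
from scipy.special import gammaincinv

def gg_sample(n, alpha, beta, gam, theta, rng):
    """Inverse-CDF sampling under the assumed form F(x) = P(alpha, (x-theta)/beta)**gam."""
    u = rng.uniform(size=n)
    return theta + beta * gammaincinv(alpha, u ** (1.0 / gam))

rng = np.random.default_rng(0)
alpha, beta, theta = 1.5, 1.0, 0.0       # common shape, scale, location
g1, g2 = 2.0, 1.0                        # the two power parameters
x = gg_sample(200_000, alpha, beta, g1, theta, rng)
y = gg_sample(200_000, alpha, beta, g2, theta, rng)
print(np.mean(y < x))                    # should be close to g1 / (g1 + g2) = 2/3
```

The identity holds because X and Y share the same baseline cdf G, so P(Y < X) = ∫ G^{γ2} d(G^{γ1}) = γ1/(γ1 + γ2), free of (α, β, θ).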
It is difficult to obtain both the exact and the asymptotic distributions of R̂, and thereby to find a confidence interval for R. One may, however, use the parametric bootstrap technique proposed by Efron (1982) to obtain a confidence interval.
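A minimal sketch of the parametric bootstrap percentile interval for R, under simplifying assumptions: (α, β, θ) are treated as known, the assumed exponentiated-gamma form is used, and the closed-form estimator γ̂ = n/Σ(−log G(xᵢ)) stands in for the paper's modified MLE:

```python
import numpy as np
from scipy.special import gammainc, gammaincinv

alpha, beta, theta = 1.5, 1.0, 0.0     # treated as known, for simplicity
rng = np.random.default_rng(1)

def sample(n, g):
    """Inverse-CDF sampling from the assumed GG(alpha, beta, g, theta)."""
    return theta + beta * gammaincinv(alpha, rng.uniform(size=n) ** (1.0 / g))

def fit_gamma(data):
    """Closed-form MLE of the power parameter when the baseline G is known:
    -log G(X) ~ Exp(g), so g_hat = n / sum(-log G(x_i))."""
    G = gammainc(alpha, (data - theta) / beta)
    return len(data) / np.sum(-np.log(G))

x, y = sample(40, 2.0), sample(40, 1.0)          # observed samples
g1_hat, g2_hat = fit_gamma(x), fit_gamma(y)
R_hat = g1_hat / (g1_hat + g2_hat)

boot = []
for _ in range(2000):                            # parametric bootstrap resamples
    b1 = fit_gamma(sample(len(x), g1_hat))
    b2 = fit_gamma(sample(len(y), g2_hat))
    boot.append(b1 / (b1 + b2))
ci_lo, ci_hi = np.percentile(boot, [2.5, 97.5])  # percentile interval
print(f"R_hat = {R_hat:.3f}, 95% bootstrap CI = ({ci_lo:.3f}, {ci_hi:.3f})")
```

The key step is resampling from the *fitted* model rather than from the data, which is what distinguishes the parametric bootstrap from its nonparametric counterpart.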

Bayesian Estimation
A natural and simple choice of priors is to assume that α, β, γ1 and γ2 are independently distributed, and that θ has a truncated exponential distribution on (0, θ0). The prior parameters should be chosen so as to reflect the prior knowledge about α, β, γ1, γ2 and θ.

Posterior Distribution
The joint distribution of X, Y and ρ has a pdf in which D(β, θ) and S(α, β, θ) are given in (1) and I_{θ<w} is defined in (2). The posterior distribution of ρ, given X = x and Y = y, then follows, where w0 = min(θ0, w), g_Z(t, s) is the pdf of a gamma variable Z with shape and scale parameters t and s, respectively, and h_Z(t, s) is the pdf of a truncated exponential variable Z with parameters t and s.

Posterior Expectation
For any continuous function k(·) of ρ, the posterior expectation can be expressed as a ratio of two integrals, where E1(·) denotes the expectation under the importance density. Hence, to find the posterior expectation of a function, we can use the following general importance sampling procedure:
Step 1: Generate β from its importance density.
Step 2: For the β obtained in Step 1, generate θ from the truncated exponential(ξ + (m + n)β, w0) distribution.
Step 5: From Steps 1 to 4, compute (7) by averaging the numerator and the denominator over the simulated values.
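The averaging in Step 5 is the standard self-normalized importance sampling estimator, E[k(ρ) | data] ≈ Σᵢ k(ρᵢ)wᵢ / Σᵢ wᵢ. The toy stand-in below illustrates the mechanics with a Gamma(3, 1) target and an Exponential(1) proposal; these densities are placeholders for the paper's posterior and importance densities:

```python
import numpy as np
from scipy.stats import gamma, expon

rng = np.random.default_rng(2)
draws = expon.rvs(size=100_000, random_state=rng)   # samples from the proposal
w = gamma.pdf(draws, a=3) / expon.pdf(draws)        # unnormalized importance weights
post_mean = np.sum(draws * w) / np.sum(w)           # self-normalized estimate of E[X]
print(post_mean)                                    # close to 3, the Gamma(3, 1) mean
```

Because the weights are only known up to a constant, the same ratio form applies in the paper's setting, where the normalizing constant of the posterior is intractable.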

Highest Probability Density Intervals
A Monte Carlo method that uses importance sampling to compute highest probability density (HPD) intervals for parameters, and for functions of them, was developed by Chen and Shao (1999). The method can be used to find HPD intervals for the model parameters α, β, γ1, γ2, θ and also for R.
Then, for q sufficiently large, the 100(1 − γ)% HPD interval for λ is given by the shortest interval among I_j(q) = (λ̂^{(j/q)}, λ̂^{((j+(1−γ)q)/q)}), j = 1, 2, . . ., γq, where λ̂^{(δ)} is an estimate of the δ-th quantile of λ computed from the weighted importance sample. We can find HPD intervals for α, β, γ1, γ2, θ and R in this way.
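A sketch of the Chen–Shao construction for a scalar parameter λ: among all empirical intervals covering 100(1 − γ)% of the posterior draws, take the shortest. Equal weights are assumed here for simplicity (the weighted version replaces the sorted counts by cumulative weights); the Beta(8, 4) draws are purely illustrative:

```python
import numpy as np

def hpd_interval(draws, cred=0.95):
    """Shortest interval containing a fraction `cred` of the draws."""
    s = np.sort(np.asarray(draws, dtype=float))
    m = int(np.floor(cred * len(s)))
    widths = s[m:] - s[:len(s) - m]     # widths of all candidate intervals
    j = int(np.argmin(widths))          # index of the shortest one
    return s[j], s[j + m]

rng = np.random.default_rng(3)
hpd_lo, hpd_hi = hpd_interval(rng.beta(8, 4, size=50_000))
print(f"95% HPD from Beta(8, 4) draws: ({hpd_lo:.3f}, {hpd_hi:.3f})")
```

For a skewed posterior such as this one, the HPD interval is shorter than the equal-tail interval and is shifted toward the mode.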

Data Analysis
In this section we illustrate the procedures discussed above by analyzing simulated data sets and real-life data sets.
For the simulated data sets, in order to check which estimation procedure gives the better fit, we have computed the Kolmogorov-Smirnov (K-S) distances between the empirical and the fitted distribution functions, based on the modified MLEs and on the Bayes estimators, and tested at a 5% level of significance. For data set 1, the K-S distance based on the modified MLEs (Bayes estimates) is 0.2217 (0.1523) and the corresponding p-value is 0.391 (0.629). Similarly, for data set 2, the K-S distance based on the modified MLEs (Bayes estimates) is 0.1636 (0.2011) and the corresponding p-value is 0.716 (0.322). Thus, for data set 1 the Bayes estimates provide a better fit than the modified MLEs, while for data set 2 the modified MLEs give a better fit than the Bayes estimates.
For the real-life data sets, to examine which set of parameter estimates gives the better fit, we again compute the K-S distances between the empirical and the fitted distributions based on the modified MLEs and the Bayes estimators, and test at a 5% level of significance. For data set 1, the p-value comes out to be 0.3003 (0.0225) for the modified MLEs (Bayes estimators), and for data set 2, the p-value is 0.7123 (0.4476) for the modified MLEs (Bayes estimators). Hence, for these data the modified MLEs give a better fit than the Bayes estimates.
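The goodness-of-fit comparison can be sketched with scipy.stats.kstest: compute the distance between the empirical cdf and the fitted cdf, here taken to be the assumed exponentiated-gamma form with illustrative (not fitted) parameter values:

```python
import numpy as np
from scipy.special import gammainc, gammaincinv
from scipy.stats import kstest

alpha, beta, gam, theta = 1.5, 1.0, 2.0, 0.0   # illustrative parameter values
rng = np.random.default_rng(4)

# A simulated "data set" drawn from the assumed model via the inverse CDF
data = theta + beta * gammaincinv(alpha, rng.uniform(size=40) ** (1.0 / gam))

# Fitted CDF under the assumed exponentiated-gamma form
def fitted_cdf(x):
    return gammainc(alpha, np.maximum((x - theta) / beta, 0.0)) ** gam

stat, pval = kstest(data, fitted_cdf)
print(f"K-S distance = {stat:.4f}, p-value = {pval:.3f}")
```

In the paper's comparison, the same K-S distance is computed twice per data set, once with the modified MLEs plugged into the cdf and once with the Bayes estimates.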
The following figures show the plots of the empirical survival functions and the fitted survival functions. The plots also indicate that the modified maximum likelihood method of estimation provides a better fit than the Bayes method of estimation.

A Monte Carlo Simulation Study
A simulation study has been carried out to compare the two methods of estimation. We take the parameter values γ1, γ2 = 0.5, 1.0, 1.5 and 2.0. Without loss of generality, we have taken θ = 0, α = 1.5 and β = 1.0. The sample sizes considered are (m, n) = (10, 10), (20, 20) and (40, 40). For a particular set of parameters and a given generated sample, we compute the modified MLE and the Bayes estimator of R, and replicate the process 1000 times. For the Bayes estimator of R, we have taken small values of the exponential hyperparameters to reflect vague prior information, viz. a = b = c1 = c2 = 1. We also assumed that ξ = 1 and θ0 = 2. Forty thousand simulated values of θ, α, β, γ1 and γ2 are used to implement the importance sampling procedure. We then compute the mean squared error in each case. The results are reported in Tables 1 to 4.

Table 1: MSEs of the modified MLE and the Bayes estimator of R when γ1 = 0.5.
(m, n)      γ2 = 0.5           γ2 = 1.0           γ2 = 1.5           γ2 = 2.0
            MMLE     Bayes     MMLE     Bayes     MMLE     Bayes     MMLE     Bayes
(10, 10)    0.0150   0.0175    0.0113   0.0109    0.0243   0.0314    0.0162   0.0234
(20, 20)    0.0086   0.0096    0.0091   0.0098    0.0125   0.0213    0.0097   0.0120
(40, 40)    0.0056   0.0076    0.0067   0.0072    0.0079   0.0089    0.0064   0.0077

It is observed that, for both methods of estimation, the MSEs decrease as the sample sizes increase for all sets of parameters considered. However, although in most cases the MSE of the modified maximum likelihood estimator is lower than that of the Bayes estimator, this is not consistently so. Thus, it is not possible to conclude that the modified maximum likelihood method of estimation always performs better than the Bayes method of estimation.
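The MSE computation behind the tables can be sketched as follows, again with (α, β, θ) treated as known and the closed-form γ̂ = n/Σ(−log G(xᵢ)) standing in for the paper's modified MLE (both simplifying assumptions):

```python
import numpy as np
from scipy.special import gammainc, gammaincinv

alpha, beta, theta = 1.5, 1.0, 0.0
g1, g2 = 0.5, 1.0                      # one cell of the simulation design
R_true = g1 / (g1 + g2)
rng = np.random.default_rng(5)

def sample(n, g):
    """Inverse-CDF sampling from the assumed GG(alpha, beta, g, theta)."""
    return theta + beta * gammaincinv(alpha, rng.uniform(size=n) ** (1.0 / g))

def fit_gamma(d):
    """Stand-in estimator of the power parameter (baseline known)."""
    return len(d) / np.sum(-np.log(gammainc(alpha, (d - theta) / beta)))

errs = []
for _ in range(1000):                  # 1000 replications, as in the paper
    a_hat = fit_gamma(sample(20, g1))
    b_hat = fit_gamma(sample(20, g2))
    errs.append((a_hat / (a_hat + b_hat) - R_true) ** 2)
mse = float(np.mean(errs))
print(f"Simulated MSE at (m, n) = (20, 20): {mse:.4f}")
```

Each table cell is produced by one such loop for the corresponding (γ1, γ2, m, n) combination, once with the modified MLE and once with the Bayes estimator.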

Discussion
The paper studies the estimation of R = P(Y < X) when X and Y have independent four-parameter generalized gamma distributions. It is seen that the usual maximum likelihood estimators of the distribution parameters may not exist. Thus, a modified maximum likelihood procedure has been used for parameter estimation. Further, Bayesian estimation with an importance sampling procedure has been employed to estimate the model parameters and hence R. Simulated and real-life data sets have been analyzed using the two methods of estimation, and a simulation study has been conducted to compare them. It may be noted that the maximum likelihood method is a classical approach to parameter estimation, while the Bayes method is advised when one has informative priors. The present paper uses both methods with the intention of studying how the estimators can be obtained in a complex situation such as the one discussed here.

Figure 1: Empirical survival function and the fitted survival functions for Data Set 1.

Figure 2: Empirical survival function and the fitted survival functions for Data Set 2.