Another Generalized Transmuted Family of Distributions: Properties and Applications

We introduce and study the general mathematical properties of a new generator of continuous distributions with two extra parameters, called the Another generalized transmuted family of distributions. We present some special models and investigate the asymptotes and shapes. The new density function can be expressed as a linear combination of exponentiated densities based on the same baseline distribution. We obtain explicit expressions for the ordinary and incomplete moments and generating functions, Bonferroni and Lorenz curves, the asymptotic distribution of the extreme values, Shannon and Rényi entropies, and order statistics, all of which hold for any baseline model. Certain characterisations are presented. Further, we introduce a bivariate extension of the new family. We discuss different methods of estimating the model parameters and illustrate the potential of the family by means of two applications to real data. A brief simulation study evaluating the maximum likelihood estimators is also carried out.


Introduction
Numerous classical distributions have been extensively used over the past decades for modeling data in several areas such as engineering, actuarial, environmental and medical sciences, biological studies, demography, economics, finance and insurance. However, in many applied areas such as lifetime analysis, finance and insurance, there is a clear need for extended forms of these distributions. For that reason, several methods for generating new families of distributions have been studied. Some attempts have been made to define new families of probability distributions that extend well-known families of distributions and at the same time provide great flexibility in modeling data in practice.
Let $r(t)$ be the probability density function (pdf) of a random variable $T \in [a, b]$ for $-\infty < a < b < \infty$ and let $W[G(x)]$ be a function of the cumulative distribution function (cdf) of a random variable $X$ satisfying the following conditions:

(i) $W[G(x)] \in [a, b]$, (ii) $W[G(x)]$ is differentiable and monotonically non-decreasing, and (iii) $W[G(x)] \to a$ as $x \to -\infty$ and $W[G(x)] \to b$ as $x \to \infty$. (1)

Recently, Alzaatreh et al. (2013) defined the T-X family of distributions by

$$F(x) = \int_a^{W[G(x)]} r(t)\,dt, \tag{2}$$

where $W[G(x)]$ satisfies the conditions (1). The pdf corresponding to (2) is given by

$$f(x) = \left\{\frac{d}{dx} W[G(x)]\right\} r\{W[G(x)]\}. \tag{3}$$

Taking $W[G(x)] = 1 - \bar{G}(x)^{\alpha}$, where $\bar{G}(x) = 1 - G(x)$, and $r(t) = 1 + \lambda - 2\lambda t$, $0 < t < 1$, we define the cumulative distribution function (cdf) of the Another Generalized Transmuted Class (AGT-G for short) of distributions by

$$F(x; \lambda, \alpha, \xi) = (1+\lambda)\left[1 - \bar{G}(x;\xi)^{\alpha}\right] - \lambda\left[1 - \bar{G}(x;\xi)^{\alpha}\right]^{2}, \tag{4}$$

where $G(x; \xi)$ is the baseline cdf depending on a parameter vector $\xi$, and $\alpha > 0$ and $|\lambda| \le 1$ are two additional shape parameters. For each baseline cdf $G$, the AGT-G family of distributions is defined by the cdf (4). It includes the Transmuted family of distributions and the proportional reversed hazard rate models. Some special models are given in Table 1.
This paper is organized as follows. In Section 2, we define the AGT-G family. Some special cases of this family are defined in Section 3. In Section 4, the asymptotics and shapes of the density and hazard rate functions are expressed analytically. Some useful expansions are derived in Section 5. In Section 6, we provide explicit expressions for the moments, incomplete moments, generating function and mean deviation. Extreme values are discussed in Section 7.
General expressions for the Rényi and Shannon entropies are presented in Section 8. General results for order statistics are obtained in Section 9. Certain characterisations are given in Section 10. In Section 11, we introduce a bivariate extension of the new family. Estimation procedures for the model parameters are presented in Section 12. Applications to two real data sets illustrate the performance of the new family in Section 13. The paper is concluded in Section 14.
Table 1: Some known special cases of the AGT-G model.

The new family
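The density function corresponding to (4) is given by

$$f(x; \lambda, \alpha, \xi) = \alpha\, g(x;\xi)\, \bar{G}(x;\xi)^{\alpha-1}\left[1 - \lambda + 2\lambda\, \bar{G}(x;\xi)^{\alpha}\right], \tag{5}$$

where $g(x; \xi)$ is the baseline pdf. Equation (5) will be most tractable when the cdf $G(x)$ and the pdf $g(x)$ have simple analytic expressions. Hereafter, a random variable $X$ with density function (5) is denoted by $X \sim$ AGT-G$(\alpha, \lambda, \xi)$. Further, we can sometimes omit the dependence on the parameter vector $\xi$ and simply write $G(x) = G(x; \xi)$.
The hazard rate function (hrf) of $X$ becomes

$$h(x) = \frac{\alpha\, g(x)\left[1 - \lambda + 2\lambda\, \bar{G}(x)^{\alpha}\right]}{\bar{G}(x)\left[1 - \lambda + \lambda\, \bar{G}(x)^{\alpha}\right]}. \tag{6}$$

To motivate the new family, let $Z_1, Z_2$ be i.i.d. random variables with cdf $1 - \bar{G}(x; \xi)^{\alpha}$, let $Z_{1:2} = \min\{Z_1, Z_2\}$ and $Z_{2:2} = \max\{Z_1, Z_2\}$, and let $V = Z_{1:2}$ with probability $(1+\lambda)/2$ and $V = Z_{2:2}$ with probability $(1-\lambda)/2$. Then

$$F_V(x) = (1+\lambda)\left[1 - \bar{G}(x;\xi)^{\alpha}\right] - \lambda\left[1 - \bar{G}(x;\xi)^{\alpha}\right]^{2},$$

which is the proposed family. The AGT-G family of distributions is easily simulated by inverting (4) as follows: if $U$ has a uniform $U(0, 1)$ distribution and $\lambda \ne 0$, then

$$X = G^{-1}\left(1 - \left[1 - \frac{1 + \lambda - \sqrt{(1+\lambda)^{2} - 4\lambda U}}{2\lambda}\right]^{1/\alpha}\right)$$

has the density function (5); for $\lambda = 0$ the fraction inside the brackets is replaced by $U$.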
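The inversion step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: it assumes the cdf (4) has the form $(1+\lambda)W - \lambda W^2$ with $W = 1 - \bar{G}(x)^{\alpha}$, as implied by the choices of $W$ and $r$ in the Introduction, and it uses an exponential baseline with a hypothetical rate `theta` purely as an example. The quadratic root is written in the algebraically equivalent form $2u / (1 + \lambda + \sqrt{(1+\lambda)^2 - 4\lambda u})$, which also covers $\lambda = 0$.

```python
import math
import random

def agt_cdf(x, alpha, lam, G):
    """AGT-G cdf, assuming F = (1 + lam) * W - lam * W**2 with W = 1 - (1 - G(x))**alpha."""
    w = 1.0 - (1.0 - G(x)) ** alpha
    return (1.0 + lam) * w - lam * w * w

def agt_pdf(x, alpha, lam, G, g):
    """AGT-G pdf obtained by differentiating the assumed cdf."""
    sbar = 1.0 - G(x)
    return alpha * g(x) * sbar ** (alpha - 1.0) * (1.0 - lam + 2.0 * lam * sbar ** alpha)

def agt_quantile(u, alpha, lam, G_inv):
    """Invert the assumed cdf at u in [0, 1)."""
    if u <= 0.0:
        return G_inv(0.0)
    # Root of lam*W**2 - (1 + lam)*W + u = 0 lying in [0, 1], in rationalized form.
    root = math.sqrt((1.0 + lam) ** 2 - 4.0 * lam * u)
    w = 2.0 * u / ((1.0 + lam) + root)
    return G_inv(1.0 - (1.0 - w) ** (1.0 / alpha))

# Hypothetical exponential baseline with rate theta (illustration only).
theta = 1.5
G = lambda x: 1.0 - math.exp(-theta * x)
g = lambda x: theta * math.exp(-theta * x)
G_inv = lambda p: -math.log(1.0 - p) / theta

rng = random.Random(0)
sample = [agt_quantile(rng.random(), 2.0, 0.5, G_inv) for _ in range(5)]
```

Any baseline with a computable quantile function can be substituted for `G_inv`; only the three baseline callables change.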

Special AGT-G distributions
In the following sections, we study some mathematical properties of the AGT-G family, since it extends several widely known distributions in the literature. First, we discuss some special AGT-G distributions.

AGT-Exponential (AGT-E) distribution
The parent exponential distribution has pdf and cdf given, respectively, by and The cdf and pdf of AGT-Exponential distribution are given by (x > 0) Figure 1 illustrates some of the possible shapes of the pdf of the AGT-E distribution.
The expectation and variance of AGT-E are:
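As a hedged worked example, suppose the parent exponential distribution has rate $\theta > 0$, so that $\bar{G}(x) = e^{-\theta x}$ (the symbol $\theta$ is an assumption; the source's parameterization is not recoverable here), and suppose the cdf (4) has the form implied by the choices of $W$ and $r$ in the Introduction. Under these assumptions the AGT-E functions and the first two cumulants work out as follows.

```latex
% Assumed baseline: G(x) = 1 - e^{-\theta x}, x > 0 (rate \theta is hypothetical notation).
\begin{align*}
F(x) &= (1+\lambda)\bigl(1 - e^{-\alpha\theta x}\bigr) - \lambda\bigl(1 - e^{-\alpha\theta x}\bigr)^{2}
      = 1 - (1-\lambda)e^{-\alpha\theta x} - \lambda e^{-2\alpha\theta x}, \\
f(x) &= \alpha\theta\, e^{-\alpha\theta x}\bigl[1 - \lambda + 2\lambda e^{-\alpha\theta x}\bigr], \\
E(X) &= \int_0^\infty \{1 - F(x)\}\,dx
      = \frac{1-\lambda}{\alpha\theta} + \frac{\lambda}{2\alpha\theta}
      = \frac{2-\lambda}{2\alpha\theta}, \\
E(X^2) &= 2\int_0^\infty x\{1 - F(x)\}\,dx = \frac{4 - 3\lambda}{2\alpha^2\theta^2}, \\
\mathrm{Var}(X) &= E(X^2) - \{E(X)\}^2 = \frac{4 - 2\lambda - \lambda^2}{4\alpha^2\theta^2}.
\end{align*}
```

Under this parameterization $\alpha$ and $\theta$ enter only through the product $\alpha\theta$, so the two are not separately identifiable for an exponential baseline.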

AGT-Fréchet (AGT-F) distribution
The parent Fréchet distribution has cdf and pdf given, respectively, by The cdf and pdf of AGT-Fréchet distribution are given by (x > 0) :

The cdf and pdf of AGT-Normal distribution are given by: and

The AGT-Uniform (AGT-U) distribution
The parent uniform distribution in the interval (0, θ), θ > 0 has cdf and pdf given, respectively, by The cdf and pdf of AGT-Uniform distribution are given by:

The AGT-Weibull (AGT-W) distribution
The parent Weibull distribution has cdf and pdf given by, respectively: The cdf and pdf of AGT-Weibull distribution are given by Figure 2 illustrates possible shapes of the density functions for some AGT-Weibull distributions.

Asymptotics and shapes
Proposition 1 The asymptotics of equations (4), (5) and (6) as G(x) → 0 are given by

Proposition 2 The asymptotics of equations (4), (5) and (6) as x → ∞ are given by

The shapes of the density and hazard rate functions can be described analytically. The critical points of the AGT-G density function are the roots of the equation The critical points of h(x) are obtained from the equation
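The leading-order behaviour can be sketched directly, assuming that (4), (5) and (6) take the forms implied by $W[G(x)] = 1 - \bar{G}(x)^{\alpha}$ and $r(t) = 1 + \lambda - 2\lambda t$. The derivation below is an illustration under that assumption, not a transcription of the authors' propositions; the first set requires $\lambda \ne -1$ and the second $\lambda \ne 1$.

```latex
% Write F = 1 - (1-\lambda)\bar{G}^{\alpha} - \lambda\bar{G}^{2\alpha}, which follows by expanding the assumed cdf.
% As G(x) \to 0, use \bar{G}(x)^{\alpha} = 1 - \alpha G(x) + o(G(x)):
\begin{align*}
F(x) &\sim \alpha(1+\lambda)\,G(x), &
f(x) &\sim \alpha(1+\lambda)\,g(x), &
h(x) &\sim \alpha(1+\lambda)\,g(x).
\end{align*}
% As x \to \infty, \bar{G}(x) \to 0 and the term \bar{G}^{2\alpha} is negligible:
\begin{align*}
1 - F(x) &\sim (1-\lambda)\,\bar{G}(x)^{\alpha}, &
f(x) &\sim \alpha(1-\lambda)\,g(x)\,\bar{G}(x)^{\alpha-1}, &
h(x) &\sim \frac{\alpha\,g(x)}{\bar{G}(x)}.
\end{align*}
```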

Useful expansions
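By using the generalized binomial expansion, we can show that the cdf (4) of $X$ has the expansion

$$F(x) = \sum_{k=0}^{\infty} c_k\, H_k(x), \tag{28}$$

where $c_0 = 0$ and, for $k \ge 1$,

$$c_k = (-1)^{k+1}\left[(1-\lambda)\binom{\alpha}{k} + \lambda\binom{2\alpha}{k}\right], \tag{29}$$

and $H_a(x) = G(x)^{a}$ denotes the exponentiated-G ("exp-G" for short) cumulative distribution. We can prove that $\sum_{k=0}^{\infty} c_k = 1$. Some structural properties of the exp-G distributions are studied by Mudholkar et al. (1996), Gupta and Kundu (2001) and Nadarajah and Kotz (2006), among others.
The density function of $X$ can be expressed as an infinite linear combination of exp-G density functions

$$f(x) = \sum_{k=0}^{\infty} c_{k+1}\, h_{k+1}(x), \tag{30}$$

where $h_{k+1}(x) = (k+1)\, g(x)\, G(x)^{k}$ is the exp-G density with power parameter $k + 1$.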
Equation (30) reveals that the AGT-G density function is a linear combination of exp-G density functions. Thus, some mathematical properties of the new model can be derived from the corresponding properties of the exp-G distribution. For example, the ordinary and incomplete moments and the moment generating function (mgf) of X can be obtained from those quantities of the exp-G distribution.
The formulae derived throughout the paper can be easily handled in most symbolic computation software platforms such as Maple, Mathematica and Matlab. These platforms currently have the ability to deal with analytic expressions of formidable size and complexity.
Established explicit expressions for statistical measures can be more efficient than computing them directly by numerical integration. The infinite upper limit in these sums can be replaced by a large positive integer such as 20 or 30 for most practical purposes.
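The effect of truncating the sums can be checked numerically. The sketch below assumes the cdf (4) equals $(1+\lambda)W - \lambda W^2$ with $W = 1 - \bar{G}^{\alpha}$, from which the coefficients $c_k = (-1)^{k+1}[(1-\lambda)\binom{\alpha}{k} + \lambda\binom{2\alpha}{k}]$ follow by the generalized binomial expansion; these coefficients are derived here under that assumption and may be written differently in the original equation (29). The function names are illustrative.

```python
def gen_binom(a, k):
    """Generalized binomial coefficient C(a, k) for real a and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (a - i) / (i + 1)
    return out

def c_coef(k, alpha, lam):
    """Expansion coefficient c_k under the assumed form of the cdf (c_0 = 0)."""
    if k == 0:
        return 0.0
    sign = 1.0 if k % 2 == 1 else -1.0  # (-1)**(k + 1)
    return sign * ((1.0 - lam) * gen_binom(alpha, k) + lam * gen_binom(2.0 * alpha, k))

def cdf_exact(G, alpha, lam):
    """Closed-form cdf as a function of the baseline cdf value G in [0, 1]."""
    w = 1.0 - (1.0 - G) ** alpha
    return (1.0 + lam) * w - lam * w * w

def cdf_truncated(G, alpha, lam, K):
    """Partial sum of the exp-G expansion up to k = K."""
    return sum(c_coef(k, alpha, lam) * G ** k for k in range(K + 1))

alpha, lam = 1.5, 0.4
for K in (5, 10, 20, 30):
    err = max(abs(cdf_truncated(G, alpha, lam, K) - cdf_exact(G, alpha, lam))
              for G in (0.1, 0.5, 0.9))
    print(K, err)
```

The truncation error grows as the baseline cdf value approaches 1, so a larger cut-off may be needed when the far right tail matters.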

Moments
Let $Y_k$ be a random variable having the exp-G distribution with power parameter $k + 1$, i.e., with density $h_{k+1}(x)$. A first formula for the $n$th moment of $X \sim$ AGT-G follows from (30) as

$$E(X^{n}) = \sum_{k=0}^{\infty} c_{k+1}\, E(Y_k^{n}), \tag{31}$$

where $E(Y_k^{n}) = (k+1)\int_{-\infty}^{\infty} x^{n}\, g(x)\, G(x)^{k}\,dx$. Expressions for moments of several exp-G distributions are given in Nadarajah and Kotz (2006b), which can be used to obtain $E(X^{n})$.
A second formula for $E(X^{n})$ can be written from (31) in terms of the G quantile function as

$$E(X^{n}) = \sum_{k=0}^{\infty} (k+1)\, c_{k+1}\, \tau(n, k), \tag{32}$$

where $\tau(n, k) = \int_0^1 Q_G(u)^{n}\, u^{k}\,du$ and $Q_G(u)$ denotes the baseline quantile function. Nadarajah et al. (2011) obtained $\tau(n, k)$ for some well-known distributions such as the Normal, Beta, Gamma and Weibull distributions, which can be used to find the moments of AGT-G.
For empirical purposes, the shape of many distributions can be usefully described by what we call the incomplete moments. These types of moments play an important role in measuring inequality, for example, income quantiles and Lorenz and Bonferroni curves, which depend on the incomplete moments of a distribution. The $n$th incomplete moment of $X$ is

$$m_n(y) = \int_{-\infty}^{y} x^{n} f(x)\,dx = \sum_{k=0}^{\infty} c_{k+1} \int_{-\infty}^{y} x^{n}\, h_{k+1}(x)\,dx. \tag{33}$$

The last integral can be computed for most G distributions.
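The quantile-based moment formula can be illustrated numerically. This sketch assumes the expansion coefficients $c_k = (-1)^{k+1}[(1-\lambda)\binom{\alpha}{k} + \lambda\binom{2\alpha}{k}]$ (derived from the assumed form of the cdf (4), not copied from the source), evaluates $\tau(n, k) = \int_0^1 Q_G(u)^n u^k\,du$ by a plain midpoint rule, and uses an exponential baseline with hypothetical rate `theta`. For that baseline the mean has the closed form $(2-\lambda)/(2\alpha\theta)$ under the same assumptions, which serves as a check.

```python
import math

def gen_binom(a, k):
    """Generalized binomial coefficient C(a, k) for real a and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (a - i) / (i + 1)
    return out

def c_coef(k, alpha, lam):
    """Assumed expansion coefficient c_k (c_0 = 0)."""
    if k == 0:
        return 0.0
    sign = 1.0 if k % 2 == 1 else -1.0
    return sign * ((1.0 - lam) * gen_binom(alpha, k) + lam * gen_binom(2.0 * alpha, k))

def tau(n, k, Q, steps=20000):
    """Midpoint-rule approximation of the integral of Q(u)**n * u**k over (0, 1)."""
    h = 1.0 / steps
    return h * sum(Q((i + 0.5) * h) ** n * ((i + 0.5) * h) ** k for i in range(steps))

def agt_moment(n, alpha, lam, Q, K):
    """Truncated series sum_{k=0}^{K} (k + 1) * c_{k+1} * tau(n, k)."""
    return sum((k + 1) * c_coef(k + 1, alpha, lam) * tau(n, k, Q) for k in range(K + 1))

theta = 1.0                                   # hypothetical baseline rate
Q_exp = lambda u: -math.log(1.0 - u) / theta  # exponential quantile function
alpha, lam = 2.0, 0.5

# For integer alpha the coefficients vanish for k > 2 * alpha, so K = 3 is exact here.
mean_series = agt_moment(1, alpha, lam, Q_exp, K=3)
mean_closed = (2.0 - lam) / (2.0 * alpha * theta)
```

For non-integer $\alpha$ the series does not terminate, and the cut-off `K` has to be increased until the value stabilizes.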

Generating function
Let $M_X(t) = E(e^{tX})$ be the mgf of $X \sim$ AGT-G. Then, the first form of $M_X(t)$ comes from (30) as

$$M_X(t) = \sum_{k=0}^{\infty} c_{k+1}\, M_k(t), \tag{34}$$

where $M_k(t)$ is the mgf of $Y_k$. Hence, $M_X(t)$ can be determined from the exp-G generating function.
A second formula for M X (t) can be derived from (30) as where We can obtain the mgfs of several distributions directly from equation (35).

Mean deviation
The mean deviations about the mean, $\delta_1 = E(|X - \mu_1|)$, and about the median, $\delta_2 = E(|X - M|)$, of $X$ can be expressed as

$$\delta_1 = 2\mu_1 F(\mu_1) - 2 m_1(\mu_1) \quad \text{and} \quad \delta_2 = \mu_1 - 2 m_1(M), \tag{36}$$

respectively, where $\mu_1 = E(X)$, $M = \mathrm{Median}(X)$ is the median defined by $M = Q(0.5)$, $F(\mu_1)$ is easily calculated from the cdf (4) and $m_1(z) = \int_{-\infty}^{z} x f(x)\,dx$ is the first incomplete moment obtained from (33) with $n = 1$.

Now, we provide two alternative ways to compute $\delta_1$ and $\delta_2$. A general equation for $m_1(z)$ can be derived from (30) as

$$m_1(z) = \sum_{k=0}^{\infty} c_{k+1}\, J_k(z),$$

where $J_k(z) = \int_{-\infty}^{z} x\, h_{k+1}(x)\,dx$ is the basic quantity to compute the mean deviation for the exp-G distributions. Hence, the mean deviations in (36) depend only on the mean deviations of the exp-G distribution. So, alternative representations for $\delta_1$ and $\delta_2$ are

$$\delta_1 = 2\mu_1 F(\mu_1) - 2\sum_{k=0}^{\infty} c_{k+1}\, J_k(\mu_1) \quad \text{and} \quad \delta_2 = \mu_1 - 2\sum_{k=0}^{\infty} c_{k+1}\, J_k(M).$$

A simple application of $J_k(z)$ refers to the AGT-W distribution discussed in Section 3.5. The exponentiated Weibull with parameter k + 1 has pdf (for x > 0) given by and then The last integral is just the incomplete gamma function and then the mean deviation for the AGT-W distribution can be determined from A second general formula for $m_1(z)$ can be derived by setting $u = G(x)$ in (30) where

Remark: These equations can be applied to obtain the Bonferroni and Lorenz curves, defined for a given probability $\pi$ by

$$B(\pi) = \frac{m_1(q)}{\pi\,\mu_1} \quad \text{and} \quad L(\pi) = \frac{m_1(q)}{\mu_1},$$

respectively, where $\mu_1 = E(X)$ and $q = Q(\pi)$ is the qf of $X$ at $\pi$.

Extreme values
Let $\bar{X} = (X_1 + \cdots + X_n)/n$ denote the mean of a random sample from (5). Then, by the usual central limit theorem, $\sqrt{n}\,(\bar{X} - E(X))/\sqrt{\mathrm{Var}(X)}$ approaches the standard normal distribution as $n \to \infty$ under suitable conditions. Sometimes one would be interested in the asymptotics of the extreme values $M_n = \max\{X_1, \ldots, X_n\}$ and $m_n = \min\{X_1, \ldots, X_n\}$.
First, suppose G belongs to the max domain of attraction of the Gumbel extreme value distribution. Then by Leadbetter, Lindgren, and Rootzén (2012) (chapter 1), there must exist a strictly positive function, say h(t), such that for every x ∈ (−∞, ∞). So, it follows from Leadbetter et al. (2012) (chapter 1) that F belongs to the max domain of attraction of the Gumbel extreme value distribution with for some suitable norming constants $a_n > 0$ and $b_n$. Second, suppose G belongs to the max domain of attraction of the Fréchet extreme value distribution. Then from Leadbetter et al. (2012) (chapter 1), there must exist a β > 0 such that for some suitable norming constants $a_n > 0$ and $b_n$. We conclude that F belongs to the same max domain of attraction as that of G. The same argument applies to the min domain of attraction. That is, F belongs to the same min domain of attraction as that of G.

Entropies
An entropy is a measure of variation or uncertainty of a random variable X. Two popular entropy measures are the Rényi and Shannon entropies (Rényi, 1961; Shannon, 2001). The Rényi entropy of a random variable with pdf $f(x)$ is defined as

$$I_R(\gamma) = \frac{1}{1-\gamma}\,\log\left(\int_{-\infty}^{\infty} f^{\gamma}(x)\,dx\right)$$

for $\gamma > 0$ and $\gamma \ne 1$. The Shannon entropy of a random variable X is defined by $E\{-\log[f(X)]\}$. It is the special case of the Rényi entropy when $\gamma \uparrow 1$. Direct calculation yields First we define and compute Using generalized binomial expansion and then after some algebraic manipulations, we obtain

Proposition 3 Let X be a random variable with pdf (5). Then, The simplest formula for the entropy of X is given by After some algebraic manipulations, we obtain an alternative expression for $I_R(\gamma)$ where $Y_i \sim \mathrm{Beta}(\gamma(\alpha - 1) + 1, \gamma j + 1)$ and

Order statistics
Order statistics make their appearance in many areas of statistical theory and practice. Suppose $X_1, \ldots, X_n$ is a random sample from the AGT-G family of distributions. We can write the density of the $i$th order statistic, say $X_{i:n}$, as

$$f_{i:n}(x) = K f(x)\, F(x)^{i-1}\{1 - F(x)\}^{n-i} = K \sum_{j=0}^{n-i} (-1)^{j}\binom{n-i}{j} f(x)\, F(x)^{j+i-1},$$

where $K = n!/[(i-1)!\,(n-i)!]$. Following similar algebraic manipulations, we can write the density function of the $i$th order statistic, $X_{i:n}$, as where $h_{r+k+1}(x)$ denotes the exp-G density function with power parameter $r + k + 1$ and $c_k$ is defined in equation (29). Here, the quantities $f_{j+i-1,k}$ are obtained recursively by $f_{j+i-1,0} = c_0^{\,j+i-1}$ and (for k ≥ 1) Equation (41) is the main result of this section. It reveals that the pdf of the AGT-G order statistic is a linear combination of exp-G density functions. So, several mathematical quantities of the AGT-G order statistics such as ordinary, incomplete and factorial moments, mgf, mean deviation and several others can be obtained from those quantities of the exp-G distribution.

Characterization results
In designing a stochastic model for a particular modeling problem, an investigator will be vitally interested to know whether the model fits the requirements of a specific underlying probability distribution. To this end, the investigator will rely on the characterizations of the selected distribution. Generally speaking, the problem of characterizing a distribution is an important problem in various fields and has recently attracted the attention of many researchers. Consequently, various characterization results have been reported in the literature, established in many different directions. In this section, we present characterizations of the AGT-G distribution. These characterizations are based on: (i) a simple relationship between two truncated moments; (ii) conditional expectations of a function of the random variable. We note that characterization (i), which is expressed in terms of the ratio of truncated moments, is stable in the sense of weak convergence. It also serves as a bridge between a first order differential equation and probability, and it does not require a closed form of the cdf.

Characterizations based on two truncated moments
In this subsection we present characterizations of the AGT-G distribution in terms of a simple relationship between two truncated moments. Our characterization results employ an interesting result due to Glänzel (1987) (Theorem 4 below). The advantage of the characterizations given here is that the cdf F need not have a closed form; it is given in terms of an integral whose integrand depends on the solution of a first order differential equation, which can serve as a bridge between probability and differential equations.
Theorem 4 Let $(\Omega, \mathcal{F}, \mathbf{P})$ be a given probability space and let $H = [a, b]$ be an interval for some $a < b$ ($a = -\infty$, $b = \infty$ might as well be allowed). Let $X : \Omega \to H$ be a continuous random variable with the distribution function $F$ and let $q_1$ and $q_2$ be two real functions defined on $H$ such that

$$E[q_1(X) \mid X \ge x] = E[q_2(X) \mid X \ge x]\,\eta(x), \quad x \in H,$$

is defined with some real function $\eta$. Assume that $q_1, q_2 \in C^{1}(H)$, $\eta \in C^{2}(H)$ and $F$ is a twice continuously differentiable and strictly monotone function on the set $H$. Finally, assume that the equation $q_2 \eta = q_1$ has no real solution in the interior of $H$. Then $F$ is uniquely determined by the functions $q_1$, $q_2$ and $\eta$; in particular,

$$F(x) = \int_a^x C \left|\frac{\eta'(u)}{\eta(u)\, q_2(u) - q_1(u)}\right| \exp(-s(u))\,du,$$

where the function $s$ is a solution of the differential equation $s' = \dfrac{\eta'\, q_2}{\eta\, q_2 - q_1}$ and $C$ is a constant, chosen to make $\int_H dF = 1$.
Remarks 5 (a) In Theorem 4, the interval H need not be closed since the condition is only on the interior of H. (b) Clearly, Theorem 4 can be stated in terms of two functions q 1 and η by taking q 2 (x) ≡ 1, which will reduce the condition given in Theorem 4 to E [q 1 (X) | X ≥ x] = η (x) .However, adding an extra function will give a lot more flexibility, as far as its application is concerned.
Proposition 6 Let X : Ω → (0, ∞) be a continuous random variable and let q 2 (x) = 1 − λ + 2λ G (x) α −1 and q 1 (x) = q 2 (x) G (x) for x > 0. The pdf of X is (5) if and only if the function η defined in Theorem 4 has the form Proof.Let X have pdf (5), then and and finally Conversely, if η is given as above, then and hence Now, in view of Theorem 4, X has pdf (5).
Corollary 7 Let X : Ω → (0, ∞) be a continuous random variable and let q 2 (x) be as in Proposition 6. The pdf of X is (5) if and only if there exist functions q 1 and η defined in Theorem 4 satisfying the differential equation Proof. The proof is straightforward and hence omitted.
Remarks 8 (a) The general solution of the differential equation in Corollary ?? is Proof.Is similar to that of Proposition 9.

Bivariate extension
In this section we introduce a bivariate version of the proposed model. The joint cdf is given by where G(x, y; ξ) is a bivariate continuous distribution with marginal cdf's G 1 (x; ξ) and G 2 (y; ξ). We call this distribution the another bivariate generalized transmuted G (ABGT-G) distribution. The marginal cdf's are given by and The joint pdf of (X, Y ) is easily obtained by f X,Y (x, The marginal pdf's are The conditional cdf's are and .
The conditional density functions are

Estimation
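Here, we determine the maximum likelihood estimates (MLEs) of the model parameters of AGT-G from complete samples only. Let $x_1, \ldots, x_n$ be observed values from the AGT-G distribution with parameters $\alpha$, $\lambda$ and $\xi$. Let $\Theta = (\alpha, \lambda, \xi)^{\top}$ be the $r \times 1$ parameter vector. The total log-likelihood function for $\Theta$ is given by

$$\ell_n(\Theta) = n\log\alpha + \sum_{i=1}^{n}\log g(x_i;\xi) + (\alpha-1)\sum_{i=1}^{n}\log\bar{G}(x_i;\xi) + \sum_{i=1}^{n}\log\left[1 - \lambda + 2\lambda\,\bar{G}(x_i;\xi)^{\alpha}\right]. \tag{47}$$

The log-likelihood function can be maximized either directly, by using SAS (PROC NLMIXED) or the Ox program (sub-routine MaxBFGS) (see Doornik, 2007), or by solving the nonlinear likelihood equations obtained by differentiating (47). The components of the score function $U_n(\Theta) = (\partial\ell_n/\partial\alpha, \partial\ell_n/\partial\lambda, \partial\ell_n/\partial\xi)^{\top}$ are

$$\frac{\partial\ell_n}{\partial\alpha} = \frac{n}{\alpha} + \sum_{i=1}^{n}\log\bar{G}(x_i;\xi) + 2\lambda\sum_{i=1}^{n}\frac{\bar{G}(x_i;\xi)^{\alpha}\log\bar{G}(x_i;\xi)}{1 - \lambda + 2\lambda\,\bar{G}(x_i;\xi)^{\alpha}},$$

$$\frac{\partial\ell_n}{\partial\lambda} = \sum_{i=1}^{n}\frac{2\,\bar{G}(x_i;\xi)^{\alpha} - 1}{1 - \lambda + 2\lambda\,\bar{G}(x_i;\xi)^{\alpha}},$$

$$\frac{\partial\ell_n}{\partial\xi} = \sum_{i=1}^{n}\frac{g^{(\xi)}(x_i;\xi)}{g(x_i;\xi)} - (\alpha-1)\sum_{i=1}^{n}\frac{G^{(\xi)}(x_i;\xi)}{\bar{G}(x_i;\xi)} - 2\lambda\alpha\sum_{i=1}^{n}\frac{\bar{G}(x_i;\xi)^{\alpha-1}\,G^{(\xi)}(x_i;\xi)}{1 - \lambda + 2\lambda\,\bar{G}(x_i;\xi)^{\alpha}},$$

where $h^{(\xi)}(\cdot)$ denotes the derivative of the function $h$ with respect to $\xi$.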
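The log-likelihood and two of its score components can be sketched as follows. The code assumes the density (5) has the form $\alpha\, g\, \bar{G}^{\alpha-1}[1 - \lambda + 2\lambda\bar{G}^{\alpha}]$ implied by the Introduction; the data values and the unit-rate exponential baseline are hypothetical, chosen only so that the analytic scores can be compared with finite differences.

```python
import math

def loglik(data, alpha, lam, G, g):
    """Log-likelihood under the assumed AGT-G density, baseline parameters held fixed."""
    total = len(data) * math.log(alpha)
    for x in data:
        sbar = 1.0 - G(x)
        total += math.log(g(x)) + (alpha - 1.0) * math.log(sbar)
        total += math.log(1.0 - lam + 2.0 * lam * sbar ** alpha)
    return total

def score_alpha_lambda(data, alpha, lam, G):
    """Analytic partial derivatives of the log-likelihood in alpha and lambda."""
    u_alpha = len(data) / alpha
    u_lam = 0.0
    for x in data:
        sbar = 1.0 - G(x)
        sa = sbar ** alpha
        denom = 1.0 - lam + 2.0 * lam * sa
        u_alpha += math.log(sbar) + 2.0 * lam * sa * math.log(sbar) / denom
        u_lam += (2.0 * sa - 1.0) / denom
    return u_alpha, u_lam

# Hypothetical data and baseline, for illustration only.
data = [0.2, 0.5, 0.9, 1.4, 2.3]
G = lambda x: 1.0 - math.exp(-x)
g = lambda x: math.exp(-x)

alpha, lam, eps = 1.3, 0.3, 1e-6
u_alpha, u_lam = score_alpha_lambda(data, alpha, lam, G)
fd_alpha = (loglik(data, alpha + eps, lam, G, g) - loglik(data, alpha - eps, lam, G, g)) / (2 * eps)
fd_lam = (loglik(data, alpha, lam + eps, G, g) - loglik(data, alpha, lam - eps, G, g)) / (2 * eps)
```

In practice the function `loglik` would be passed, with its sign flipped, to a general-purpose numerical optimizer, with the constraints $\alpha > 0$ and $|\lambda| \le 1$ enforced.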

Maximum product spacing estimates
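The maximum product spacing (MPS) method was proposed by Cheng and Amin (1983). This method is based on the idea that the differences (spacings) between consecutive points should be identically distributed. The geometric mean of the differences is given as

$$GM = \left[\prod_{i=1}^{n+1} D_i\right]^{1/(n+1)}, \tag{49}$$

where the difference $D_i$ is defined by

$$D_i = F(x_{(i)}, \lambda, \alpha, \xi) - F(x_{(i-1)}, \lambda, \alpha, \xi), \quad i = 1, \ldots, n+1.$$

Here, $F(x_{(0)}, \lambda, \alpha, \xi) = 0$ and $F(x_{(n+1)}, \lambda, \alpha, \xi) = 1$. The MPS estimators $\hat{\lambda}_{PS}$, $\hat{\alpha}_{PS}$ and $\hat{\xi}_{PS}$ of $\lambda$, $\alpha$ and $\xi$ are obtained by maximizing the geometric mean (GM) of the differences. Substituting the cdf of AGT-G in (49) and taking the logarithm of the above expression, we will have

$$\log GM = \frac{1}{n+1}\sum_{i=1}^{n+1}\log\left[F(x_{(i)}, \lambda, \alpha, \xi) - F(x_{(i-1)}, \lambda, \alpha, \xi)\right].$$

The MPS estimators $\hat{\lambda}_{PS}$, $\hat{\alpha}_{PS}$ and $\hat{\xi}_{PS}$ of $\lambda$, $\alpha$ and $\xi$ can be obtained as the simultaneous solution of the following non-linear equations: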
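A minimal sketch of the MPS objective is given below, assuming the cdf (4) has the form $(1+\lambda)W - \lambda W^2$ with $W = 1 - \bar{G}^{\alpha}$. The data, the unit-rate exponential baseline and the coarse grid search are all hypothetical stand-ins for a proper numerical optimizer; tied observations would produce zero spacings and are not handled here.

```python
import math

def agt_cdf(x, alpha, lam, G):
    """Assumed AGT-G cdf."""
    w = 1.0 - (1.0 - G(x)) ** alpha
    return (1.0 + lam) * w - lam * w * w

def mps_objective(data, alpha, lam, G):
    """Mean log-spacing, i.e. the logarithm of the geometric mean of the spacings."""
    xs = sorted(data)
    cdf_vals = [0.0] + [agt_cdf(x, alpha, lam, G) for x in xs] + [1.0]
    spacings = [hi - lo for lo, hi in zip(cdf_vals, cdf_vals[1:])]
    return sum(math.log(d) for d in spacings) / len(spacings)

# Hypothetical data and baseline, for illustration only.
G_exp = lambda x: 1.0 - math.exp(-x)
data = [0.2, 0.5, 0.9, 1.4, 2.3]

# Coarse grid over alpha in [0.2, 3.0] and lambda in [-0.9, 0.9].
grid = [(a / 10.0, l / 10.0) for a in range(2, 31) for l in range(-9, 10)]
best_alpha, best_lam = max(grid, key=lambda p: mps_objective(data, p[0], p[1], G_exp))
```

Because the $n + 1$ spacings sum to one, the objective can never exceed $-\log(n + 1)$, which gives a quick sanity check on any implementation.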

Least square estimates
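Let $x_{1:n}, x_{2:n}, \ldots, x_{n:n}$ be the ordered sample of size $n$ drawn from the AGT-G population. Then, the expectation of the empirical cumulative distribution function is defined as

$$E[F(x_{i:n})] = \frac{i}{n+1}, \quad i = 1, 2, \ldots, n.$$

The least square estimates (LSEs) $\hat{\lambda}_{LS}$, $\hat{\alpha}_{LS}$ and $\hat{\xi}_{LS}$ of $\lambda$, $\alpha$ and $\xi$ are obtained by minimizing

$$\sum_{i=1}^{n}\left[F(x_{i:n}, \lambda, \alpha, \xi) - \frac{i}{n+1}\right]^{2}.$$

Therefore, $\hat{\lambda}_{LS}$, $\hat{\alpha}_{LS}$ and $\hat{\xi}_{LS}$ can be obtained as the solution of the following system of equations: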

Applications
Now we use a real data set to show that the AGT-E distribution can be a better model than the beta-exponential (Nadarajah and Kotz, 2006a), Kumaraswamy-exponential and exponential distributions.
The LR test statistic for testing the hypotheses $H_0: \lambda = 0$ and $\alpha = 1$ versus $H_1: \lambda \ne 0$ or $\alpha \ne 1$ for the data set is $\omega = 11.584 > 5.991 = \chi^2_{2;0.05}$, so we reject the null hypothesis. In order to compare the distribution models, we consider criteria such as the Kolmogorov-Smirnov (K-S) statistic, $-2\ell$, AIC (Akaike information criterion), AICC (corrected Akaike information criterion) and BIC (Bayesian information criterion) for the data set. The better distribution corresponds to smaller K-S, $-2\ell$, AIC, AICC and BIC values:
• $KS = \sup_x |F_n(x) - F(x)|$, where $F_n(x)$ is the empirical distribution function,
• $AIC = -2\log\ell(\underset{\sim}{x}, \alpha, \lambda, \xi) + 2p$,
• $AICC = AIC + \dfrac{2p(p+1)}{n-p-1}$,
• $BIC = -2\log\ell(\underset{\sim}{x}, \alpha, \lambda, \xi) + p\log(n)$,
where $p$ is the number of parameters to be estimated from the data and $n$ is the sample size. Also, for calculating the values of K-S we use the sample estimates of α, λ and γ. Table 2 shows the MLEs under the fitted distributions, and Table 3 shows the values of K-S, $-2\ell$, AIC, AICC and BIC. The values in Table 3 indicate that the AGT-E leads to a better fit than the exponential distribution.
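The comparison criteria are straightforward to compute once a maximized log-likelihood and a fitted cdf are available. The sketch below uses the AIC, AICC and BIC formulas listed above; the K-S statistic is taken to be the usual supremum distance between the empirical and fitted cdfs, evaluated at the jump points of the empirical cdf, which is an assumption since the source's exact definition is not recoverable. Function names and the example numbers are illustrative only.

```python
import math

def information_criteria(loglik, p, n):
    """Return (AIC, AICC, BIC) for maximized log-likelihood, p parameters, n observations."""
    aic = -2.0 * loglik + 2.0 * p
    aicc = aic + 2.0 * p * (p + 1) / (n - p - 1)
    bic = -2.0 * loglik + p * math.log(n)
    return aic, aicc, bic

def ks_statistic(data, cdf):
    """Supremum distance between the empirical cdf of data and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))

def lr_statistic(loglik_full, loglik_null):
    """Likelihood ratio statistic for nested models."""
    return 2.0 * (loglik_full - loglik_null)

# Illustrative numbers, not taken from the paper's data set.
aic, aicc, bic = information_criteria(-100.0, p=2, n=50)
ks = ks_statistic([0.25, 0.5, 0.75], lambda x: min(max(x, 0.0), 1.0))
```

With two restrictions under the null hypothesis, `lr_statistic` is compared with the 5% critical value 5.991 of the chi-square distribution with two degrees of freedom, as in the test reported above.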
The P-P plots, fitted distribution function and density functions of the considered models are plotted in Figures 3 and 4, respectively, for the data set.

Figure 1: The pdf's of various AGT-E distributions .
c , for every x > 0. So, it follows from Leadbetter et al. (2012) (chapter 1) that F belongs to the max domain of attraction of the Gumbel extreme value distribution with lim n→∞ P [a n (M n − b n ≤ x)] = exp(−x α c ) for some suitable norming constants a n > 0 and b n .Third, suppose G belongs to the max domain of attraction of the Weibull extreme value distribution.Then, Leadbetter et al. (2012) (chapter 1), there must exist a β > 0 for every x < 0. So, it follows from Leadbetter et al. (2012) (chapter 1) that F belongs to the max domain of attraction of the Weibull extreme value distribution with lim n→∞ P [a n (M n − b n ≤ x)] = exp −(−x) β

Table 2: Estimated parameters of the AGT-E, BE and KwE distributions for the data set.

Table 3: Criteria for comparison.