Compound Conway-Maxwell Poisson Gamma Distribution: Properties and Estimation

The distribution of a random sum of random events is called a compound distribution. It involves a counting (discrete) distribution to model the number of occurrences of the random event in a fixed time period and a continuous distribution to model the outcome of the random event. It has applications in the fields of actuarial sciences, meteorology etc. For example, in modelling insurance loss amounts through compound distributions, the number of claims and the claim amounts are used to calculate the total claim amount of a portfolio. The number of claims is modelled through a discrete distribution and the claim amounts are modelled through continuous distributions. Generally, Poisson distribution is used in compound models as the discrete distribution and such models are known as compound Poisson models. However, the equi-dispersion property of the Poisson distribution hinders its application in scenarios where the underlying count data is either over-or under-dispersed. In this paper, a two-parameter Poisson distribution, namely, Conway-Maxwell Poisson (CMP) distribution, which handles both over-and under-dispersed data, is considered as the counting distribution, and the corresponding compound CMP distribution is developed. Some mathematical properties of the distribution are derived and a methodology to estimate the parameters using the likelihood approach is proposed. A numerical illustration of the proposed methodology is given through a simulation study. An application of the compound CMP model is illustrated through transportation security administration (TSA) insurance claim data. Also, the estimation of the risk measures associated with the TSA claim data is discussed.


Introduction
Let N be the counting random variable, and let X i , i = 1, 2, . . .be a sequence of independent and identically distributed positive random variables.Then the distribution of the random sum is called a compound distribution.Here X i s and N are assumed to be independent.Com-pound distributions have potential applications in various domains, see, for example, Gómez-Déniz and Pérez-Rodríguez (2019), Silva and Windmeijer (2001) in the domain of tourism and health economics, and Withers and Nadarajah (2011) in the domain of meteorology.The present work is motivated by the application of compound distributions in the domain of actuarial science to model aggregate claim amounts (S) during a fixed policy period.In this framework, N denotes the number of claims made on a portfolio of insurance policies, and X i , i = 1, 2, . . ., N denotes the corresponding individual claim amounts.Often in modelling S, the distribution of N is assumed to be Poisson, in which case, the distribution of S given in equation 1 is said to have a compound Poisson distribution.For detailed discussion on compound Poisson distribution, one may refer to Klugman, Panjer, and Willmot (2012) and Bahnemann (2015).Assuming the distribution of X i s to be two-parameter gamma, Withers and Nadarajah (2011) have derived the compound Poisson-gamma distribution and studied its properties.Since Poisson distribution has equi-dispersion property, the above model can be used when the underlying count data are equi-dispersed.Quite often, in modelling aggregate claim amount, the claim frequencies are over-or under-dispersed.In such situations, a modification of Poisson distribution that can accommodate over-and under-dispersed data is needed.One such distribution is the Conway-Maxwell Poisson (CMP) distribution which is a two-parameter modification of Poisson distribution that includes Poisson, Bernoulli and geometric distributions as its special cases.Originally introduced by Conway and Maxwell (1962), it has been revisited by Shmueli, Minka, Kadane, Borle, and Boatwright (2005) to discuss its properties and applications.The probability mass function (pmf) of CMP distribution is given by where is the normalizing constant.Here, λ denotes the location parameter and ν denotes the dispersion parameter that captures the degree of over-or under-dispersion.The CMP distribution is over-dispersed for ν < 1, under-dispersed for ν > 1, and equi-dispersed for ν = 1.For more details on CMP distribution, one may refer to Sellers, Borle, and Shmueli (2012).In the present work, the distributions of N and X i s in equation ( 1) are assumed to be CMP and gamma respectively.The random sum S is then said to have a compound CMP-gamma (CCMPG) distribution.Unlike the compound Poisson gamma distribution, CCMPG distribution can be used to model S when the underlying count data is over-or under-dispersed.The rest of the paper is organized as follows.In Section 2, the CCMPG distribution is derived.The expressions for the moment generating function (mgf), cumulant generating function (cgf), raw moments and cumulants are obtained in Section 3. In Section 4, the maximum likelihood (ML) estimation of the parameters of the CCMPG distribution is discussed, and a 2-step methodology to estimate the model parameters is proposed.Section 5 deals with interval estimation of the parameters through bootstrap approach.A simulation study to illustrate the computation of the ML estimates is presented in Section 6. Section 7 contains a real-life application of the CCMPG distribution to travel insurance claims data.Concluding remarks are given in Section 8.

Compound CMP gamma distribution
Let N represent the number of claims arising from an insurance portfolio in a given period having pmf as given in equation ( 2).Let X i , i = 1, 2, . . ., N denote the individual claim amounts.Then the total claim amount S is modelled as given in equation (1).Note that N = 0 ⇔ S = 0.It is assumed that the individual claim amounts (X i s) are independent of each other and are also independent of N .
Let p 0 denote the probability mass at S = 0. Since S is not continuous at zero, the probability density function (pdf) of S is represented in terms of generalized pdf as where and Here, δ(.) is the Dirac delta function such that ∞ 0 δ(s)ds = 1 f * i (x) denotes the i-fold convolution of X i s.For more details on Dirac delta function, one may refer to Pishro-Nik (2016).Since X i ∈ (0, +∞), it can be modelled by a continuous distribution having support in R + .Gamma distribution is one such distribution that is often used in modelling claim amounts since it has positive support, positively skewed with a light tail.We consider the distribution of the individual claim amounts to be a two-parameter gamma having pdf where α is the scale and β is the shape parameter.Since gamma distribution is closed under convolution, we have where U = y i=1 X i .Using equations ( 2) and (7) in equation ( 4), we get Substituting equation (8) in equation ( 3), the pdf of S is obtained as where the function r(y) is defined as . The random variable S with pdf as defined in ( 9) is said to have a CCMPG distribution with parameters λ, ν, α, and β.The function r(y) can be expressed in terms of generalized hypergeometric function as shown in Result 1.A special case of this distribution is the compound Poisson-gamma distribution (Withers and Nadarajah 2011) where R(z) is defined as Here 0 F q is the generalized hypergeometric function defined as The derivative and the summation in the above steps are interchanged because Taking z in equation ( 11) to be y β β and multiplying β, we get Hence the result.
Since generalized hypergeometric functions are not divergent, the function r(.) defined in equation ( 9) is not divergent.For β = 1, ν = 0.2, the plots of r k (y) for different choices of y are shown in Figure 1, where r k (y) denote the truncated sum of r(y) and is defined as It is observed from the plots that the values of r k (y) approach to a constant for increasing values of k.

Moment generating function
Consider S = N i=1 X i , where N and X i s are independent.Since X i s are independent, denoting X ≡ X i , the moment generating function (mgf) of S is given by where M N (.) is the mgf of CMP distribution and M X (.) is the mgf of gamma distribution with parameters (α, β).From equations ( 2) and ( 6), M N (.) and M X (.) are obtained as follows. and Substituting the above mgfs in equation ( 12), we get The r th raw moment of S is obtained from the mgf as Following Nadarajah (2009), the mgf and the moments of CCMPG distribution when ν ∈ Z + can also be expressed in terms of generalized hypergeometric functions as follows.
Consequently, the expectation and the variance of S using equation ( 15) are Writing the normalizing constant of the CMP distribution in terms of modified Bessel function as given in Daly and Gaunt (2016) for ν = 2, the mfg of CCMPG distribution can be expressed as where I r (.) denote the modified Bessel function defined as The mean and variance based on the above mgf are obtained respectively as 3.1.Cumulants using asymptotic expansion of the normalizing constant Gillispie and Green (2015) have shown that the asymptotic expansion of Z(λ, ν) for a fixed ν as The above expansion is used to obtain asymptotic expressions for the cumulants of the CCMPG distribution.
Result 2. The m th cumulant of CCMPG distribution is given by Proof.Using equation ( 13), the cumulant generating function of CCMPG distribution is The m th cumulant is given by Using equation ( 16), we get Differentiating the above m times with respect to t and taking t = 0, the expression for κ m is obtained.
Consequently, the asymptotic expressions for the mean and the variance of S as λ → ∞ are given by Using Result 2, the skewness and kurtosis of CCMPG distribution as λ → ∞ are obtained, respectively as and

Point estimation
In this section, estimation of the parameters of S by the method of moments and maximum likelihood is discussed.Let θ = (λ, ν, α, β) ′ .

Method of moment estimation
Using equation ( 14), the first four moments of S are derived and they are used to obtain the moment estimates.The moment equations are as given below.(20) where m ′ a denote the a th sample moment.The moment estimates of θ can be obtained by solving equations ( 17) to (20) simultaneously.

Method of maximum likelihood estimation
Let s = (s 1 , s 2 , . . ., s p ) be a random sample of size p from the CCMPG distribution.Let there be M (> 0) positive values in s and T = p − M zeros.Following the approach given in Withers and Nadarajah (2011), we observe that M ∼ Binomial(p, 1−p 0 ) and p 0 = Z(λ, ν) −1 .Then (M = 0) = Z(λ, ν) −1 p .Therefore the likelihood function based on s and M = m is The corresponding log-likelihood function is The score functions of the parameters λ, ν, α and β equated to zero are given in equations ( 22) to (25) respectively.
It is clear from the method of moments and likelihood estimation that obtaining closed-form expression for the estimators of the parameters is not possible.Hence, one has to resort to iterative procedures involving Newton-Raphson method to obtain the estimates.Due to the complex nature of the equations ( 17) to (20) and equations ( 22) to (25) which involve infinite sums, implementation of root finding algorithms might not yield reasonably good estimates.
To make the estimation procedure less complex, in the sequel, a 2-step ML methodology is proposed assuming that data on N is known.

ML estimation when N is known
Since the ML estimators do not have closed-form expression, in this section, a 2-step ML methodology is proposed to estimate the parameters assuming N to be known.The merit of assuming N to be known is that the estimates of the parameters λ and ν can be directly obtained from the CMP distribution.Let n i and s i , i = 1, 2, . . ., p denote claim frequencies and aggregate claim amounts for the i th claim.The proposed methodology is as follows.
Step 1: Using the observed claim frequencies, obtain the ML estimates for the parameters λ and ν of CMP distribution defined in equation ( 2).
Step 2: The log-likelihood function of S given in equation ( 21) is then maximized to obtain the estimates of α and β after plugging in λ and ν obtained in Step 1.

Interval estimation
This section provides the interval estimation of θ.As observed in the previous section, the ML estimators are not in closed form and hence the exact distributions of θ cannot be easily derived.Therefore, the confidence intervals (CI) of the estimators are computed using the parametric bootstrap technique.Following are the steps to obtain the parametric bootstrap CIs for θ.
Steps 2: Draw B bootstrap samples of size p with replacement from this original sample.
The CIs are computed assuming that data on N is known.

Simulation study
In this section, the performance of the 2-step ML estimation procedure is illustrated through simulation study.Random samples are generated from the CCMPG distribution as follows.
Step (a): Generate a random sample of size one from CMP distribution for fixed λ and ν, call this value d.
Step (b): Generate d random samples from the gamma distribution defined in equation ( 6) fixing α and β.
Step (c): Sum the d gamma samples to obtain the sample observation from the CCMPG distribution.
Repeating Steps (a)-(c) p times yields pairs of observations (n i , s i ), i = 1, 2, . . ., p.If the random observation generated from CMP distribution is zero, then the corresponding sample observation of the CCMPG distribution is taken as zero.
The simulation study is designed by considering different combinations of the parameter values.To accommodate the dispersion in N , three values for ν, namely, less than, equal to and greater than one, corresponding to over-, under-and equi-dispersion are considered.The values for α and β are fixed so that the mean of the gamma distribution is less than, equal to and greater than one.The parameter choices considered for the simulation study are given below.

Real-life application
In this section, the proposed CCMPG distribution is used to model Transportation Security Administration (TSA) claims data.TSA is a United States agency created to maintain the security of the nation's air transportation systems.TSA is responsible for recording the claims of the individuals who have been injured or had items damaged, lost, or stolen.Information on every claim from 2002 to 2017 can be found at https://www.dhs.gov/tsa-claims-data.
For the illustration, monthly data on the claims from January 2004 to December 2008 are extracted as follows.
• The number of claims for each month from January 2004 to December 2008 is calculated based on the claim date.This gives the data on claim frequency N .
• For each month, the individual claim amounts (in US dollars) are summed up to obtain the aggregate claim amount, S.  The proposed 2-step methodology described in section 4.3 is applied to obtain the ML estimates of the parameter θ of aggregate claim distribution.The estimates and their corresponding standard errors (in parenthesis) are found to be λ=1.6221(0.1470), ν=0.0871 (0.0163), α= 0.00097 (0.0019) and β=0.2348 (0.4765). Figure 3

Computation of risk measures
Estimation of potential risk that could occur in a portfolio of claims is of interest to any insurance company.To determine the expected risk, percentile-based risk measures such as the value at risk (VaR) and expected shortfall (ES) are often used.VaR is defined as the maximum amount expected to be lost (or at risk) by the insurance company in a future time period with a given confidence level, say α.It is calculated as the (100×α) th percentile of the aggregate claim distribution S satisfying the relation P (S ≤ VaR) = α, α ∈ (0, 1).The expected shortfall is the average of all the aggregate claim amounts above VaR and is given by E Since the quantile function of the CCMPG distribution is not tractable, the percentiles of S are calculated by generating 100000 samples of the aggregate claim amounts by using Steps (a)-(c) of Section 6 fixing the ML estimates obtained in the previous section as the parameter values.The estimated mean and variance of S obtained through the generated samples are 63723.28and 240656272 respectively, which are closer to the mean and variance values of the observed data reported in Table 4. Using the percentiles of S, the estimated 95% monthly VaR and ES are found to be 90470.89and 98167.78US dollars, respectively.This means that the probability of losing not more than 90470.89US dollars in aggregate claim amount by the TSA agency in the future month is 0.95 and the expected shortfall to meet an aggregate claim requirement of larger than 90470.89US dollars in the future month is 98167.78US dollars.
To validate these estimated measures, the VaR and ES are computed from the TSA claims dataset based on 60 observations and are found to be 89474.79and 91128.58respectively.
The results indicate that the percentiles obtained based on ML estimates are closer to that of the observed data, demonstrating that the proposed 2-step ML methodology works well.

Concluding remarks
Modelling and analysis of the aggregate claim amount distribution has always been an important aspect in actuarial studies.Over the years, researchers have developed several aggregate claim models using Poisson distribution and its variants as the claim frequency distribution.
In the present work, the CMP distribution, which is capable of modelling over-and underdispersed claim frequencies is used as the claim frequency distribution.Correspondingly, the compound CMP Gamma distribution to model the aggregate claim amounts is developed taking the individual claim amount distribution to be gamma.A 2-step ML estimation methodology is proposed to estimate the parameters of the CCMPG distribution.The merit of the proposed methodology is that the estimates are obtained in a 2-step process rather than simultaneously estimating all the four parameters.An application of the proposed methodology to simulated data, carried out with various choices of the parameter values, reveals that the resulting estimates have smaller bias, smaller MSEs and high coverage probabilities in the case of over-as well as under-dispersed claim frequency data.The results show that the ML estimators perform well for all sample sizes.Application of the CCMPG distribution to model the travel insurance claims data resulted in a good fit.Also, the proposed estimation methodology can be used to obtain reliable estimates of two important risk measures, namely value at risk (VaR) and expected shortfall (ES), as highlighted in the real-life application.Reliable estimates of these risk measures capacitate the insurance agency to quantify and control the level of risk exposure arising from uncertainties through proper planning and administration.

Figure 1 :
Figure 1: Plot of r k (y) versus k for different choices of y.
(a)-(c) taking p = 25, 50, 100.For generating sample from CMP distribution, rcmp() function in COMPoissonReg package in R ((Kimberly, Thomas, and Andrew 2019)) is used.The ML estimates are obtained using the proposed 2-step ML methodology given in section 4.3.The estimates of the parameters (λ, ν) are obtained using glm.cmp()function in COMPoissonReg package.To obtain the estimates of α and β, maxLik() function in R is used.The number of simulation runs are taken as r = 1000.The average (AVG), mean absolute bias (MAB) and mean square error (MSE) of the estimates obtained are computed as below.

Figure 4 :
Figure 3: Fitted probability density function and histogram of S

Table 1 :
The average of the ML estimates, their MABs, MSEs, width of CI and CP for over-

Table 2 :
The average of the ML estimates, their MABs, MSEs, width of CI and CP for equidispersion caseOnly the claims which are settled and approved are considered.Thus the dataset consists of p = 60 pairs of observations (n i , s i ), i = 1, 2, ..., p.The summary statistics of the claim frequency N and the aggregate claim amounts S are given in Table4.The histogram of claim frequency is displayed in Figure2.

Table 4 :
Summary statistics of claim frequency and aggregate claim amountsIt is seen from Table4that the variance of N is larger than its mean, indicating that the claim frequencies are over-dispersed.The claim frequencies are modelled with various discrete distributions and the values of the measures of goodness-of-fit, namely log-likelihood, AIC and BIC are presented in Table5.These measures are computed using the in-built functions glm.cmp() under COMPoissonReg package and fitdistr() under MASS package available in R. From the table, it is observed that the CMP distribution provides a better fit to model the claim frequency data.

Table 5 :
Goodness-of-fit of claim frequency distribution