Kernel-based Estimation of Ageing Intensity Function: Properties and Applications

The notion of ageing plays an important role in reliability and survival analysis, as it is an inherent property of all systems and products. Jiang, Ji, and Xiao (2003) proposed a new quantitative measure, known as the ageing intensity (AI) function, as an alternative measure to study the ageing pattern of probability models. In this paper, we propose a non-parametric estimator for the ageing intensity function. Asymptotic properties of the estimator are established under suitable regularity conditions. A set of simulation studies is carried out based on various probability models to examine the performance of the estimator and to establish its efficiency over the classical estimator. The usefulness of the estimator is also examined through a real data set.


Introduction
Let $X$ be a non-negative random variable representing the lifetime of a living organism or a component/system, with an absolutely continuous cumulative distribution function $F(\cdot)$, survival function $\bar{F}(\cdot) = 1 - F(\cdot)$ and hazard function $h(\cdot) = f(\cdot)/\bar{F}(\cdot)$. Then the AI function of $X$ is defined as the ratio of the hazard rate to a baseline hazard rate. When the baseline hazard rate is the average hazard rate $\frac{1}{t}\int_0^t h(u)\,du$, the AI function is defined by

$$L(t) = \frac{h(t)}{\frac{1}{t}\int_0^t h(u)\,du} = \frac{t\,h(t)}{H(t)}, \quad t > 0, \tag{1}$$

where $H(\cdot)$ denotes the cumulative hazard rate function. The larger the value of $L(\cdot)$, the stronger the tendency of ageing of the random variable $X$. Also, $L(t) = c$ for $t > 0$, with $c$ a constant, characterizes the family of Weibull distributions with shape parameter $c$. It is to be noted that the hazard rate function uniquely determines the AI function, but not conversely.
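The Weibull characterization can be checked numerically. The following is a minimal sketch, assuming the form $L(t) = t\,h(t)/H(t)$ obtained by dividing the hazard rate by the average hazard rate; the function name `ageing_intensity` and the parameter values are illustrative and not taken from the paper.

```python
import numpy as np

def ageing_intensity(t, hazard, cum_hazard):
    """AI function L(t) = t * h(t) / H(t), evaluated elementwise for t > 0."""
    t = np.asarray(t, dtype=float)
    return t * hazard(t) / cum_hazard(t)

# Weibull model with shape c and scale lam (illustrative values, assumed here).
c, lam = 2.5, 1.7

def weibull_hazard(t):
    # h(t) = (c / lam) * (t / lam)^(c - 1)
    return (c / lam) * (t / lam) ** (c - 1)

def weibull_cum_hazard(t):
    # H(t) = (t / lam)^c
    return (t / lam) ** c

t_grid = np.array([0.1, 0.5, 1.0, 2.0, 5.0])
ai_values = ageing_intensity(t_grid, weibull_hazard, weibull_cum_hazard)
print(ai_values)  # each entry equals the shape parameter c, up to rounding
```

Setting `c = 1` gives the exponential model, for which the same computation returns $L(t) = 1$ at every $t$.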
Jiang et al. (2003) introduced the AI function as a quantitative measure for determining the ageing behaviour of a lifetime random variable. They also proposed a general theoretic framework for the optimal burn-in problem when the hazard rate of a product is unimodal. It includes three main components. First, any unimodal hazard rate can be viewed as either quasi-decreasing, quasi-constant, or quasi-increasing. Secondly, in the ageing tendency analysis, the AI function reveals the relationship between the ageing property and the model parameters. Thirdly, the AI function provides two critical values of the model parameters, indicating the partitions between different ageing features.

Nanda, Bhattacharjee, and Alam (2007) proposed an ageing intensity ordering and studied its closure properties under different reliability operations, for the formation of k-out-of-n systems, and under increasing transformations. Sunoj and Rasin (2018) introduced a quantile-based ageing intensity function and obtained various ageing properties and ordering relationships. Recently, a family of generalized ageing intensity functions of univariate absolutely continuous lifetime random variables has been studied by Szymkowiak (2019). For more results and generalizations of the AI function, one can refer to Bhattacharjee, Nanda, and Misra (2013a,b), Misra and Bhattacharjee (2018), Szymkowiak (2018, 2020), Sunoj, Nair, Nanda, and Rasin (2020), and the references therein.
In the present paper, we propose a non-parametric kernel estimator for the ageing intensity function for a complete sample in the independent and identically distributed (iid) case, and numerically establish that it is more efficient than the classical empirical estimator. The paper is organized as follows. In Section 2, we propose a non-parametric estimator for $L_X(t)$ and obtain its Bias and Mean Squared Error (MSE). In Section 3, the asymptotic properties of the estimator are studied under suitable regularity conditions. In Section 4, simulation studies are carried out using different probability models to validate the efficiency of the estimator, and a real data set is used to examine the usefulness of the estimator in real situations.

Non-parametric estimation of L(t)
Let $\{X_i;\ 1 \le i \le n\}$ be a sequence of independent and identically distributed random variables representing the lifetimes of $n$ components or devices. The $X_i$'s have a common absolutely continuous distribution function $F(\cdot)$, probability density function $f(\cdot) = F'(\cdot)$, survival function $\bar{F}(\cdot) = 1 - F(\cdot)$, hazard rate function $h(\cdot)$ and cumulative hazard rate $H(\cdot)$.

Classical estimator
For failure data, let $N$ units be put to test at $t = 0$. Further, let the number of units having survived at the ordered times $t_j$ be $N_s(t_j)$; $j = 0, 1, 2, \ldots, k$. Then a classical estimate $L_n(t)$ of $L_X(t)$, for $t > 0$, is defined piecewise on each interval $t_j < t < t_j + \Delta t_j$ (equation (3)).

Szymkowiak (2018) has also analyzed $L_n(t)$ through generated data, real complete data and censored data sets. Further, a kernel-based estimation of the AI function was initially proposed by Misra and Bhattacharjee (2018) through a case study of bone marrow transplantation data of leukemia patients in which certain observations are censored. The estimator proposed by Misra and Bhattacharjee (2018) made use of a plug-in estimator based on the Nelson-Aalen estimator of the hazard rate, without studying the asymptotic properties of the estimator. Motivated by these, in the present study we propose a non-parametric kernel-based estimator for the AI function in the complete sample case and study its asymptotic properties.

Kernel-based estimator of L(t)
From equation (1), $L(t)$ is the ratio of two functions of $t$. Following Chen, Hsu, and Liaw (2009) for the iid case, a new non-parametric estimator of $L(t)$, denoted by $L_n^*(t)$, is obtained by combining the kernel estimators of $h(t)$ and $H(t)$, denoted respectively by $h_n(t)$ and $H_n(t)$:

$$L_n^*(t) = \frac{t\,h_n(t)}{H_n(t)}. \tag{4}$$

Here, $h_n(\cdot)$ is the kernel estimator of the hazard rate $h(\cdot)$ (see Roussas (1989)) and $H_n(\cdot)$ is the corresponding estimator of the cumulative hazard function $H(\cdot)$. The estimator $h_n(\cdot)$ is obtained from

$$h_n(t) = \frac{f_n(t)}{\bar{F}_n(t)}, \tag{5}$$

where $f_n(t)$ is the kernel density estimator of $f(\cdot)$, defined by

$$f_n(t) = \frac{1}{nb}\sum_{i=1}^{n} K\!\left(\frac{t - X_i}{b}\right), \tag{6}$$

with kernel function $K(\cdot)$ and bandwidth $b$, and $\bar{F}_n(\cdot) = 1 - F_n(\cdot)$ is the empirical survival function, with the empirical distribution function

$$F_n(t) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le t). \tag{7}$$

Then

$$H_n(t) = \int_0^t h_n(\tau)\,d\tau \tag{8}$$

is an integral estimator of $H(t)$. Applying a Taylor series expansion to $1/H_n(t)$ in (4) gives the expansion (9), and by using equations (5), (8) and (9), the estimator $L_n^*(t)$ in (4) takes the form (10).
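The construction above can be sketched in a few lines of code. This is an illustrative implementation only, under several assumptions that are not taken from the paper: a Gaussian kernel, the empirical survival function in the denominator of $h_n$, the trapezoidal rule on a uniform grid for the integral defining $H_n(t)$, and no correction for the boundary bias of the kernel density estimate near zero. The function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_ai_estimate(sample, t, bandwidth, grid_size=400):
    """Sketch of L*_n(t) = t * h_n(t) / H_n(t) at a single point t > 0."""
    x = np.asarray(sample, dtype=float)
    if not 0.0 < t < x.max():
        # Beyond the largest observation the empirical survival is zero.
        raise ValueError("t must lie in (0, max(sample))")
    tau = np.linspace(0.0, t, grid_size)          # integration grid on [0, t]
    u = (tau[:, None] - x[None, :]) / bandwidth   # (grid_size, n) array
    f_n = gaussian_kernel(u).sum(axis=1) / (x.size * bandwidth)   # kernel density
    surv_n = 1.0 - (x[None, :] <= tau[:, None]).mean(axis=1)      # empirical survival
    h_n = f_n / surv_n                                            # hazard estimate
    # Trapezoidal approximation of H_n(t), the integral of h_n over [0, t].
    H_n = np.sum(0.5 * (h_n[1:] + h_n[:-1]) * np.diff(tau))
    return t * h_n[-1] / H_n

# Illustrative use on simulated exponential data (true L(t) = 1 for this model).
rng = np.random.default_rng(0)
sample = rng.exponential(scale=0.5, size=200)
b = 1.06 * sample.std(ddof=1) * sample.size ** (-1 / 5)   # Silverman-type bandwidth
t0 = float(np.median(sample))
estimate = kernel_ai_estimate(sample, t0, b)
print(f"L*_n({t0:.3f}) = {estimate:.3f}")
```

Restricting $t$ to lie below the largest observation is a design choice of this sketch: it keeps the empirical survival function strictly positive on the whole integration grid, so the ratio $f_n/\bar{F}_n$ is always defined.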

Asymptotic properties
In this section, we prove the consistency and asymptotic normality of $L_n^*(t)$.

Theorem 3.1. The Bias and MSE of $L_n^*(\cdot)$ are given by (11) and (12), respectively.

Proof: The result follows from (1) and (10) via (13), using the assumed independence of the observations.

In the following theorem we prove the weak consistency of $L_n^*(\cdot)$.

Theorem 3.2. Assume that, as $n \to \infty$, the bandwidth $b \to 0$ with $nb \to \infty$, so that $1/(nb) \to 0$. Then $\mathrm{Bias}(L_n^*(t)) \to 0$ and $\mathrm{MSE}(L_n^*(t)) \to 0$.
Proof: From Rao (2014, p. 184) and Chen et al. (2009), and by applying the Cauchy-Schwarz inequality, (11) can be rewritten. Under the regularity conditions stated in Theorem 3.2, and from Chen et al. (2009) and Rao (2014), $E(R_n)$ and $E(R_n^2)$ become negligible, and $h_n(t)$ and $H_n(t)$ are consistent estimators of $h(t)$ and $H(t)$, respectively. Hence $\mathrm{Bias}(L_n^*(t)) \to 0$; that is, $L_n^*(t)$ is an asymptotically unbiased estimator of $L(t)$.
Definition 3.1. A sequence $\theta_n$ of estimators is integratedly consistent in quadratic mean if the mean integrated squared error (MISE) tends to zero for every $\theta \in \Theta$, a family of univariate $\theta$'s, that is,

$$\mathrm{MISE}(\theta_n) = E\int \left[\theta_n(t) - \theta(t)\right]^2 dt \to 0 \quad \text{as } n \to \infty.$$

The selection of the bandwidth is also a key factor in density estimation. Many optimization techniques are available in the literature for choosing the most appropriate bandwidth for a given kernel. The most common optimality criterion is to choose the bandwidth $b$ that minimizes the expected risk function, also termed the mean integrated squared error (MISE). For $L_n^*(t)$, the MISE is defined analogously, with $L_n^*$ and $L$ in place of $\theta_n$ and $\theta$.

Theorem 3.3. Under the assumptions of Theorem 3.2, $L_n^*(t)$ is integratedly consistent in quadratic mean. That is, the MISE of $L_n^*(t)$ (for some fixed $t > 0$) tends to zero as $n \to \infty$.
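Proof: Under weak assumptions on $f$ and $K$ (where $f$ is the, generally unknown, real density function),

$$\mathrm{MISE}(b) = \mathrm{AMISE}(b) + o\!\left(\frac{1}{nb} + b^{4}\right),$$

where AMISE is the asymptotic MISE, which consists of the two leading terms

$$\mathrm{AMISE}(b) = \frac{R(K)}{nb} + \frac{1}{4}\, m_2(K)^{2}\, b^{4}\, R(f''),$$

where $R(g) = \int g(t)^2\,dt$ for a function $g$, $m_2(K) = \int t^2 K(t)\,dt$ and $f''$ is the second derivative of $f$. The minimum of this AMISE is the solution to the differential equation

$$\frac{\partial}{\partial b}\,\mathrm{AMISE}(b) = -\frac{R(K)}{nb^{2}} + m_2(K)^{2}\, b^{3}\, R(f'') = 0.$$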
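As a worked step, the first-order condition can be solved in closed form. The derivation below is a sketch that assumes the standard two-term AMISE of a kernel density estimator with bandwidth $b$, namely $R(K)/(nb) + \tfrac{1}{4} m_2(K)^2 b^4 R(f'')$; it concerns the density estimator $f_n$ and is not a statement about the exact MISE of $L_n^*$. The Gaussian reference density used in the last line is likewise an assumption.

```latex
\begin{align*}
-\frac{R(K)}{n b^{2}} + m_2(K)^{2}\, b^{3}\, R(f'') = 0
  \quad &\Longrightarrow \quad
  b_{\mathrm{AMISE}} = \left( \frac{R(K)}{m_2(K)^{2}\, R(f'')\, n} \right)^{1/5}, \\
\mathrm{AMISE}(b_{\mathrm{AMISE}})
  &= \frac{5}{4}\, \left( m_2(K)^{2}\, R(f'') \right)^{1/5} R(K)^{4/5}\, n^{-4/5}
  \;\longrightarrow\; 0 \quad (n \to \infty), \\
\text{Gaussian } K,\ f = N(\mu, \sigma^{2}):\quad
  R(K) = \frac{1}{2\sqrt{\pi}},\ m_2(K) = 1,\
  & R(f'') = \frac{3}{8\sqrt{\pi}\,\sigma^{5}}
  \;\Longrightarrow\;
  b_{\mathrm{AMISE}} = \left( \frac{4\sigma^{5}}{3n} \right)^{1/5} \approx 1.06\,\sigma\, n^{-1/5}.
\end{align*}
```

The optimal bandwidth is of order $n^{-1/5}$, so it satisfies $b \to 0$ and $nb \to \infty$, and the corresponding AMISE decays at the rate $n^{-4/5}$.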
Next we prove the asymptotic normality of $L_n^*(t)$.

Proof: The result follows by applying Chen et al. (2009) to the iid observations. Since both $h_n(t)$ and $H_n(t)$ are asymptotically normal, Remark 4 in Chen et al. (2009) completes the proof.

Numerical study
In this section, we investigate the performance of $L_n^*(t)$ using simulated and real data sets.

Simulated study
Let $(X_1, X_2, \ldots, X_n)$ be a random sample of size $n$ taken from an exponential distribution with mean 0.5. We have carried out a series of $m = 1000$ simulations for each of the sample sizes $n = 30$, 50 and 100, and estimated $L(t)$ for a fixed $t$ as $(L_1(t), L_2(t), \ldots, L_m(t))$. Then we compute the Bias and MSE for each of these sample sizes. We consider nine percentile points $P_i$, $i = 1, \ldots, 9$, of $t$, where $P_i$ is such that $i$ out of 10 observations lie below $P_i$ and $10 - i$ out of 10 lie above $P_i$.
For estimation purposes we choose the symmetric Gaussian kernel $K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right)$, and Silverman's rule ($b = 1.06\,\sigma\,n^{-1/5}$) is applied to determine the bandwidth $b$.
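The simulation design described above can be sketched as follows. This is a scaled-down illustration, not the code used for the reported tables: it assumes that the evaluation points are the theoretical deciles of the exponential model, uses fewer replications than $m = 1000$, approximates $H_n(t)$ by the trapezoidal rule, and discards replicates in which $t$ exceeds the largest observation. All function names are hypothetical.

```python
import numpy as np

def kernel_ai(x, t, grid_size=200):
    """Kernel AI estimate t * h_n(t) / H_n(t) with a Gaussian kernel."""
    if t >= x.max():
        return np.nan  # empirical survival is zero at t; replicate is skipped
    b = 1.06 * x.std(ddof=1) * x.size ** (-1 / 5)   # Silverman's rule
    tau = np.linspace(0.0, t, grid_size)
    u = (tau[:, None] - x[None, :]) / b
    f_n = np.exp(-0.5 * u**2).sum(axis=1) / (x.size * b * np.sqrt(2.0 * np.pi))
    surv_n = 1.0 - (x[None, :] <= tau[:, None]).mean(axis=1)
    h_n = f_n / surv_n
    H_n = np.sum(0.5 * (h_n[1:] + h_n[:-1]) * np.diff(tau))  # trapezoidal rule
    return t * h_n[-1] / H_n

def bias_and_mse(t, n, m, rng, mean=0.5, true_ai=1.0):
    """Monte Carlo Bias and MSE of the estimate at a fixed t (exponential model)."""
    estimates = np.array(
        [kernel_ai(rng.exponential(scale=mean, size=n), t) for _ in range(m)]
    )
    errors = estimates - true_ai          # the exponential model has L(t) = 1
    return np.nanmean(errors), np.nanmean(errors**2)

rng = np.random.default_rng(1)
# Theoretical deciles P_i = -mean * log(1 - i/10) of the exponential distribution.
deciles = {i: -0.5 * np.log(1.0 - i / 10.0) for i in (3, 5, 7)}
results = {i: bias_and_mse(t, n=50, m=200, rng=rng) for i, t in deciles.items()}
for i, (bias, mse) in results.items():
    print(f"P_{i}: Bias = {bias:.4f}, MSE = {mse:.4f}")
```

Passing a single random generator through all replications keeps the run reproducible for a given seed while still drawing independent samples.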

Concluding remarks
In this article, we have proposed a non-parametric estimator for the AI function based on the kernel density method. We have obtained expressions for the Bias and MSE of the proposed estimator and studied its asymptotic behaviour. For numerical illustration using simulated data, we have estimated the AI function based on some important life distributions such as the exponential, Weibull, Gompertz, Makeham and inverse Lomax models, and compared it with the classical estimator of the AI function. It has been found that the proposed estimator performs better than the classical estimator in terms of Bias and MSE. A real-life application has also been considered to illustrate and validate the usefulness of the estimator in studying the ageing behaviour of a lifetime random variable and in identifying the underlying probability model based on its characteristic property. We expect that the proposed estimator will be useful for reliability practitioners and theoreticians in the modelling and analysis of the ageing behaviour of lifetime data.

Figure 3: Q-Q plot for $L_n^*(\cdot)$ of simulated exponential data

Figure 6: Plots of AI functions based on the data from Kundu and Manglick (2004)

Table 1 provides the Bias and MSE (in parentheses) of $L_n(t)$ and $L_n^*(t)$, and the Relative Efficiency (RE) of $L_n^*(t)$ over $L_n(t)$.

Table 1: Bias and MSE of $L_n(\cdot)$ and $L_n^*(\cdot)$

Tables 2 to 5 show that, in most cases, the RE of $L_n^*(\cdot)$ over $L_n(\cdot)$ is greater than unity, which shows that $L_n^*(\cdot)$ outperforms $L_n(\cdot)$. Also, the MSE of $L_n(\cdot)$ is much more scattered than that of $L_n^*(\cdot)$, as is evident from Figure 5 and Tables 2 to 5, corresponding to the Weibull, Gompertz, Makeham and inverse Lomax distributions. This enables us to conclude that $L_n^*(\cdot)$ performs better than $L_n(\cdot)$ in terms of Bias and MSE.

Table 2: Bias and MSE of $L_n(\cdot)$ and $L_n^*(\cdot)$

Table 4: Bias and MSE of $L_n(\cdot)$ and $L_n^*(\cdot)$ of Makeham Distribution

Table 5: Bias and MSE of $L_n(\cdot)$ and $L_n^*(\cdot)$ of Inverse Lomax Distribution