Testing in Nonparametric Accelerated Life Time Models

A goodness-of-fit test for the acceleration function in a nonparametric life time model is proposed. To this end, the limit distribution of an $L_2$-type test statistic is derived. Furthermore, a bootstrap method is considered and the power of the test is studied.


Introduction
We consider a life time model which describes the following situation: a covariate X may accelerate or retard the time to failure relative to some baseline. Examples for X are the dose of a drug, temperature, pressure or stress. The speeding up or slowing down is accomplished by some positive function ψ, the acceleration function, and we may write $T = T_0/\psi(X)$, where $T_0$ is the so-called baseline life time and T is the observable life time. We will assume that T is an absolutely continuous random variable (r.v.) and that the covariate X does not depend on time. For simplicity of presentation let X be one-dimensional.
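As an illustration, the following minimal Python sketch draws pairs (T, X) from such a model. It assumes the multiplicative form $T = T_0/\psi(X)$, which agrees with the conditional survival function $S_0(t\psi(x))$ used in the section on local alternatives. The unit exponential baseline, the covariate uniform on [0, 1], the acceleration function ψ(x; β) = exp(−βx) and the function name `simulate_alt` are illustrative assumptions.

```python
import math
import random


def simulate_alt(n, beta, rng):
    """Return n pairs (T, X) with T = T0 / psi(X) and psi(x) = exp(-beta * x)."""
    pairs = []
    for _ in range(n):
        x = rng.random()             # covariate X ~ U[0, 1] (illustrative choice)
        t0 = rng.expovariate(1.0)    # baseline life time T0 ~ Exp(1) (illustrative choice)
        psi = math.exp(-beta * x)    # acceleration function psi(x; beta)
        pairs.append((t0 / psi, x))  # psi > 1 shortens, psi < 1 prolongs the life time
    return pairs


rng = random.Random(2008)
sample = simulate_alt(100, beta=1.0, rng=rng)
log_times = [math.log(t) for t, _ in sample]  # Y_i = log T_i, used for the regression view
```

With β = 0 the acceleration function is identically one and the observed life times coincide with the baseline life times.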
For statistical applications a suitable choice of the acceleration function is important, and the problem of testing ψ arises. A survey of test procedures for testing ψ under different model assumptions is given in Liero and Liero (2008).
The aim of the present paper is to propose and to discuss in more detail test procedures for testing whether the function ψ belongs to a pre-specified parametric class $\mathcal{F}$ of functions $\psi(\cdot; \beta)$ indexed by a finite-dimensional parameter β. The test is based on the transformation of the life time model to a regression model. The test problem then becomes that of checking whether a regression function has a parametric form, against the alternative that it is nonparametric. As test statistic for this goodness-of-fit test, an integrated squared distance between regression estimators characterizing the hypothesis and the alternative is discussed in Section 2. For the formulation of the corresponding asymptotic α-test a limit theorem is stated. In Section 3 a bootstrap procedure is proposed and illustrated. The behavior of the power under local alternatives is investigated in Section 4. Proofs are given in Section 5.

Test Statistic and its Limit Distribution
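Let $(T_i, X_i)$, $i = 1, \ldots, n$, be independent copies of the pair $(T, X)$. With the logarithm transformation $Y_i = \log T_i$ we obtain the nonparametric regression model
$$Y_i = m(X_i) + \varepsilon_i, \qquad m(x) = \mu - \log \psi(x),$$
where $\mu = E \log T_0$ and $\varepsilon_i = \log T_{0i} - \mu$. With this transformation the testing problem H: $\psi \in \mathcal{F}$ is transformed into H: $m \in \mathcal{M}$, where $\mathcal{M}$ is the class of functions $m(\cdot; \beta, \mu) = \mu - \log \psi(\cdot; \beta)$. As test statistic we propose the weighted $L_2$-distance between a parametric estimator for the hypothetical $m \in \mathcal{M}$ and a nonparametric estimator for m.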
First, consider the parametric estimator in the hypothetical model
$$Y_i = \mu - \log \psi(X_i; \beta) + \varepsilon_i.$$
Assuming ψ(0) = 1 for identifiability, we estimate β and µ by the least squares method, that is
$$(\hat\beta_n, \hat\mu_n) = \arg\min_{\beta, \mu} \sum_{i=1}^n \big(Y_i - \mu + \log \psi(X_i; \beta)\big)^2.$$
To characterize the regression under the alternative we choose an estimator which is good for all possible regression functions, i.e. we apply nonparametric techniques. Nonparametric regression estimators can be written as weighted averages of the response variables $Y_i$,
$$\hat m_n(x) = \sum_{i=1}^n W_{b_n i}(x)\, Y_i.$$
Roughly speaking, if we estimate m at x, the weights $W_{b_n i}$ are chosen so that those $Y_i$ for which $X_i$ is near x get a large weight. This estimation idea is realized by the well-known Nadaraya-Watson kernel estimator with weights
$$W_{b_n i}(x) = \frac{K\big((x - X_i)/b_n\big)}{\sum_{j=1}^n K\big((x - X_j)/b_n\big)}.$$
Here $K : \mathbb{R} \to \mathbb{R}$ is a kernel function, and $b_n$ is a sequence of bandwidths tending to zero as n → ∞.
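The weighted-average form of the kernel estimator can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the Epanechnikov kernel is an assumed choice (it has compact support and is Lipschitz continuous), and the function names `epanechnikov`, `nw_weights` and `nw_estimate` are hypothetical.

```python
def epanechnikov(u):
    """Compactly supported kernel K(u) = 0.75 * (1 - u^2) on [-1, 1] (assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def nw_weights(x, xs, b, kernel=epanechnikov):
    """Nadaraya-Watson weights W_i(x) = K((x - X_i)/b) / sum_j K((x - X_j)/b)."""
    k = [kernel((x - xi) / b) for xi in xs]
    total = sum(k)
    if total == 0.0:
        raise ValueError("no observations within one bandwidth of x")
    return [ki / total for ki in k]


def nw_estimate(x, xs, ys, b, kernel=epanechnikov):
    """Kernel regression estimate at x: a weighted average of the responses."""
    weights = nw_weights(x, xs, b, kernel)
    return sum(w * y for w, y in zip(weights, ys))


# Usage: responses lying exactly on a line, evaluated at the centre of the design.
xs = [i / 10 for i in range(11)]
ys = [2.0 * x + 1.0 for x in xs]
m_hat = nw_estimate(0.5, xs, ys, b=0.25)
```

The weights are non-negative and sum to one, and only observations with $|x - X_i| \le b$ contribute, which is the "near to x" idea in the text.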
Several asymptotic properties of this estimator, such as consistency, asymptotic expressions for the mean squared error, and limit theorems, have been investigated. For our purpose we formulate the following limit theorem about the asymptotic normality of the stochastic part of the weighted integrated squared error
$$Q_n = \int_M \big(\hat m_n(x) - E(\hat m_n(x) \mid \mathbf{X})\big)^2 a(x)\, dx.$$
Here a is a known weight function and M is a bounded interval; the symbol $\mathbf{X}$ stands for the sample $X_1, \ldots, X_n$.
The limit behavior of $Q_n$ was considered by several authors such as Collomb (1976), Liero (1992), and Härdle and Mammen (1993). For a proof of the version given here the reader is referred to Liero (2008).
Theorem 1 Suppose that the following conditions are satisfied:
(1) The density g of the covariate X is Lipschitz continuous in a neighborhood of M; further there exist constants c and C such that [...]
(2) The function ψ is Lipschitz continuous in a neighborhood of M.
(3) There exists a number ζ > 4 such that E|log [...]
(4) The kernel K has compact support, is Lipschitz continuous and satisfies [...]
(5) The weight function a is bounded and piecewise continuous.
(6) The bandwidth sequence $b_n$ is a sequence of positive numbers with [...]
[...] where $\kappa = \int K^2(x)\,dx$ and $\kappa^* = \int (K * K)^2(x)\,dx$ ($*$ denotes the convolution).
In the limit theorem we consider only the stochastic part of the deviation of the estimator $\hat m_n$ from the underlying regression function. This is done in order to avoid the bias of the nonparametric estimator. Consequently, in the definition of our test statistic we do not compare $\hat m_n$ with the hypothetical regression with estimated parameters $m(\cdot; \hat\beta_n, \hat\mu_n)$ but with its smoothed version
$$\tilde m_n(x) = \sum_{i=1}^n W_{b_n i}(x)\, m(X_i; \hat\beta_n, \hat\mu_n).$$
(Here and in the sequel, "$\xrightarrow{D}$" denotes convergence in distribution.)

Austrian Journal of Statistics, Vol. 37 (2008), No. 1, 81-90

In other words, the test statistic has the form
$$Q_n(\hat\beta_n, \hat\mu_n) = \int_M \Big(\hat m_n(x) - \sum_{i=1}^n W_{b_n i}(x)\, m(X_i; \hat\beta_n, \hat\mu_n)\Big)^2 a(x)\, dx.$$
Now, from the theory of least squares estimation we know that under regularity conditions the estimators $\hat\beta_n$ and $\hat\mu_n$ are $\sqrt{n}$-consistent. Thus, the limit statement following from Theorem 1 for $Q_n(\beta, \mu)$ with fixed β and µ remains true if we replace the parameters by these $\sqrt{n}$-consistent estimators, and we have [...] (1)

To apply the limit statement (1) for the formulation of the asymptotic α-test we have to replace the unknown variance $\sigma^2$ and the marginal density g in the standardizing terms $\eta_n$ and $\tau_n^2$ by appropriate estimators. The function g can be estimated by a Rosenblatt-Parzen kernel estimator. For the estimation of $\sigma^2$ one can use the parametric estimator of the variance based on the residuals in the hypothetical model. Another possibility is to choose a nonparametric estimator. In Liero (2003) estimators for the variance in a homogeneous nonparametric regression model are considered. One proposal is to estimate $\sigma^2$ by [...] where $X_{j:n}$ denotes the jth order statistic. Furthermore, note that under the assumptions of Theorem 1 the standardizing terms are asymptotically equivalent to [...] Let $\hat\eta_n$ and $\hat\tau_n$ be defined as $\eta_n$ and $\tau_n$ with the unknown terms replaced by estimators. We can formulate the following consequence:

Corollary 1 Suppose that the assumptions of Theorem 1 are satisfied and that $\psi(\cdot) = \psi(\cdot; \beta)$ for some β, i.e. H holds. Further assume the following:
[...]
(2) The function ψ is partially differentiable w.r.t. β. The functions $B_j$ defined by [...] are uniformly continuous with respect to both arguments and satisfy [...] for j, l = 1, ..., d.
Based on (4) the following asymptotic α-test is proposed: reject H if
$$Q_n(\hat\beta_n, \hat\mu_n) \ge \hat\eta_n + z_\alpha \hat\tau_n,$$
where $z_\alpha$ is the (1 − α)-quantile of the standard normal distribution.
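The computation behind such a test can be sketched as follows. Because the same kernel weights smooth both the responses and the fitted hypothetical regression, the integrand is the squared kernel smooth of the residuals $Y_i - m(X_i; \hat\beta_n, \hat\mu_n)$. The sketch below is an illustration under stated assumptions: the Epanechnikov kernel, the midpoint rule for the integral, the default weight function a = 1 and all function names are hypothetical choices, and the standardizing values `eta_hat` and `tau_hat` are taken as inputs because their explicit form is not reproduced here. The decision rule assumes the standardization $(Q_n - \hat\eta_n)/\hat\tau_n$.

```python
from statistics import NormalDist


def epanechnikov(u):
    """Compactly supported, Lipschitz continuous kernel (illustrative choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def smoothed_residual(x, xs, residuals, b):
    """Nadaraya-Watson smooth of the residuals at x.

    Since the same weights smooth Y_i and the fitted values, this equals
    (nonparametric estimate) - (smoothed hypothetical regression) at x.
    """
    k = [epanechnikov((x - xi) / b) for xi in xs]
    total = sum(k)
    if total == 0.0:
        return 0.0  # no data near x: difference set to zero (sketch-level choice)
    return sum(ki * ri for ki, ri in zip(k, residuals)) / total


def l2_distance(xs, residuals, b, lower, upper, weight=lambda x: 1.0, grid=200):
    """Midpoint-rule approximation of the weighted L2 distance over [lower, upper]."""
    h = (upper - lower) / grid
    q = 0.0
    for j in range(grid):
        x = lower + (j + 0.5) * h
        q += smoothed_residual(x, xs, residuals, b) ** 2 * weight(x) * h
    return q


def reject(q, eta_hat, tau_hat, alpha=0.05):
    """Reject H when the standardized distance reaches the (1 - alpha)-quantile."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return (q - eta_hat) / tau_hat >= z_alpha


# Usage with made-up residuals on an equidistant design.
xs = [i / 20 for i in range(21)]
residuals = [0.1 * (-1) ** i for i in range(21)]
q = l2_distance(xs, residuals, b=0.2, lower=0.2, upper=0.8)
```

Restricting the integration interval to the interior of the design, as in the usage lines, keeps the kernel windows inside the data range.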

A Bootstrap Proposal
The error which occurs by approximating the distribution of the test statistic by the standard normal distribution depends not only on the underlying distribution and the sample size n but also on the smoothing parameter $b_n$. Thus, it can happen that this approximation is not good enough, even when n is large. So let us consider the following model-based resampling procedure:
1. With the l.s.e. $\hat\mu_n$ and $\hat\beta_n$ (constructed with the original data) define the modified residuals [...] where $\hat y_i$ are the fitted values and $h_i$ are the leverages.
2. For r = 1, ..., R:
(a) For i = 1, ..., n: [...]
(b) [...] to construct the nonparametric estimate $\hat m^*_{rn}$ and the smoothed estimated hypothetical regression $\tilde m^*_{rn}$.
(c) Evaluate the distances $Q^*_{rn}$ and the standardized statistic [...]

The following figure illustrates this procedure. A sample of n = 100 exponentially distributed r.v.'s with hypothetical acceleration function defined by ψ(x; β) = exp(−βx) was generated. The histogram of the R = 1000 bootstrapped $V^*$'s shows the quantile $V^*_{[0.95R]n}$, the quantile based on the limit distribution and the value of the standardized test statistic. The second figure shows the result for a simulation with the alternative acceleration function ψ(x; β) = exp(−βx − sin(πx/2)). In this case the hypothesis is rejected by the bootstrap procedure, in contrast to the procedure based on the asymptotic quantile.

[Figure: histogram of the bootstrapped V* values, H false]
In this simulation the bootstrap method works. Of course, an example is not a justification of this bootstrap method in general. However, it shows that resampling can be an appropriate tool. Therefore it seems useful and necessary to gain deeper theoretical insight into resampling and simulation procedures in order to justify their application.
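The resampling steps of this section can be sketched in Python for the hypothetical acceleration function ψ(x; β) = exp(−βx), for which the regression under H is linear, m(x; β, µ) = µ + βx. Several points are assumptions rather than the authors' procedure: the modified residuals are taken in the common form $(y_i - \hat y_i)/\sqrt{1 - h_i}$ and centred before resampling, the weight function is a = 1, the kernel is the Epanechnikov kernel, and the raw distances $Q^*$ are compared instead of the standardized $V^*$ (which would need the estimators $\hat\eta_n$ and $\hat\tau_n$). All function names are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least squares fit of y = mu + beta * x; returns (mu, beta, leverages)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    beta = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    mu = ybar - beta * xbar
    lev = [1.0 / n + (x - xbar) ** 2 / sxx for x in xs]
    return mu, beta, lev


def kernel(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def make_weights(xs, b, lower, upper, grid):
    """Precompute Nadaraya-Watson weights on a midpoint grid over [lower, upper]."""
    h = (upper - lower) / grid
    rows = []
    for j in range(grid):
        x = lower + (j + 0.5) * h
        k = [kernel((x - xi) / b) for xi in xs]
        s = sum(k)
        rows.append([ki / s for ki in k] if s > 0 else [0.0] * len(xs))
    return rows, h


def distance(xs, ys, rows, h):
    """L2 distance between the kernel estimate and the smoothed linear fit (a = 1)."""
    mu, beta, _ = fit_line(xs, ys)
    res = [y - mu - beta * x for x, y in zip(xs, ys)]
    return sum(sum(w * r for w, r in zip(row, res)) ** 2 for row in rows) * h


def bootstrap_distances(xs, ys, b, lower, upper, R=200, grid=50, seed=0):
    rng = random.Random(seed)
    mu, beta, lev = fit_line(xs, ys)
    fitted = [mu + beta * x for x in xs]
    # Step 1: modified residuals (assumed form), centred before resampling.
    mod = [(y - f) / math.sqrt(1.0 - hi) for y, f, hi in zip(ys, fitted, lev)]
    mean = sum(mod) / len(mod)
    mod = [r - mean for r in mod]
    rows, h = make_weights(xs, b, lower, upper, grid)
    q_obs = distance(xs, ys, rows, h)
    q_star = []
    for _ in range(R):  # Step 2
        y_star = [f + rng.choice(mod) for f in fitted]  # (a) responses generated under H
        q_star.append(distance(xs, y_star, rows, h))    # (b), (c) refit and evaluate
    q_star.sort()
    return q_obs, q_star


# Usage: data generated under H with beta = 1 and a unit exponential baseline.
rng = random.Random(1)
xs = [rng.random() for _ in range(100)]
ys = [1.0 * x + math.log(rng.expovariate(1.0)) for x in xs]
q_obs, q_star = bootstrap_distances(xs, ys, b=0.2, lower=0.1, upper=0.9)
crit = q_star[int(0.95 * len(q_star)) - 1]  # empirical 95% point of the bootstrap distances
reject_h = q_obs > crit
```

The kernel weights depend only on the covariates, so they are computed once and reused for every bootstrap sample; only the least squares fit and the residuals change from one replication to the next.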

The Power under Local Alternatives
The proposed test is consistent, i.e. if the underlying acceleration function ψ does not belong to the class $\mathcal{F}$, then the power of the test converges to one as n → ∞. To study how well the test distinguishes alternatives close to the hypothesis, it is useful to investigate the behavior of the power under local alternatives. That is, we consider a sequence of acceleration functions, say [...] where β is arbitrarily fixed. For simplicity of presentation we assume that the parameter β is known. The power of the proposed $L_2$-test under this alternative is given by [...] where $P_{K_n}$ is the probability measure of the r.v. (T, X) with the conditional survival function defined by $S_0(t\psi_{K_n}(x))$. We can formulate the following result:

Theorem 2 Suppose that the assumptions of Theorem 1 are satisfied and that $\Delta_n$ is Lipschitz continuous. Then the power is characterized by the sequence [...] in such a way that [...] where [...]

Let us consider the special case that the function $\Delta_n$ is defined by $\Delta_n(x) = \lambda_n D(x)$ with $\lambda_n$ being a sequence of positive numbers tending to zero. Then the "crucial" sequence has the form $n b_n^{1/2} \lambda_n^2 \int_M D^2(x) a(x)\,dx$, and the power tends to a nontrivial limit, i.e. $\lim_n \pi_n \in (\alpha, 1)$, if $\lambda_n = n^{-1/2} b_n^{-1/4} c$. The formula for $\lambda_n$ suggests that a larger bandwidth would increase the power. However, a large bandwidth leads to oversmoothing. Moreover, the approximation of the distribution of the test statistic by the standard normal distribution improves if the bandwidth tends to zero faster. Thus, the choice of the bandwidth requires some deeper investigation, but this issue is beyond the scope of this paper.
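The rate for $\lambda_n$ can be checked by direct substitution into the crucial sequence. The following worked step treats c as a positive constant not depending on n, which is an assumption about how the formula is to be read:

```latex
\[
n b_n^{1/2} \lambda_n^2 \int_M D^2(x)\, a(x)\, dx
= n b_n^{1/2} \cdot c^2 n^{-1} b_n^{-1/2} \int_M D^2(x)\, a(x)\, dx
= c^2 \int_M D^2(x)\, a(x)\, dx .
\]
```

With this choice the sequence does not depend on n, so it neither vanishes nor diverges, which is consistent with a limit of the power strictly between α and 1. The same substitution shows that the detectable deviation $\lambda_n$ shrinks like $n^{-1/2} b_n^{-1/4}$, so a larger bandwidth corresponds to a smaller $\lambda_n$.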
Remark: If we do not assume that the parameter β is known, we have to study the behavior of the least squares estimator $\hat\beta_n$ (constructed in the hypothetical model) under the alternative. In general it will turn out that the $\sqrt{n}$-consistency does not remain true, and consequently the error of the parameter estimation will enter the power.