An Approach to Robustness Evaluation for Sequential Testing under Functional Distortions in L1-metric

The problem of sensitivity analysis for the sequential probability ratio test under functional distortions of the observation probability distribution is considered. For the situation where the distorted densities of the log likelihood ratio statistic belong to ε-neighborhoods of the hypothetical centers in the L1-metric, the least favorable distributions that maximize the conditional error probabilities are constructed. The instability coefficient is obtained to enable robustness evaluation for the sequential probability ratio test and for its modification, the trimmed sequential probability ratio test.


Introduction
The sequential approach to hypothesis testing (Wald 1947) is applied in various practical problems of statistical data analysis (Mukhopadhyay and de Silva 2009). If the hypothetical assumptions are fulfilled, sequential tests require fewer observations on average than classical analogues based on a fixed number of observations to provide the given small levels of error probabilities. In practice, however, statistical data are distorted, i.e. the factual probability distribution of observations deviates from the hypothetical model (Kharin and Voloshko 2011). Therefore it is important to characterize the influence of the distortions on the error probabilities.
Similar problems of robustness analysis were investigated in Kharin (2002), Kharin and Kishylau (2005), and Kharin (2013a) for discrete data under "contamination" (Huber and Ronchetti 2009). The problems of robustness analysis and of robust decision rule construction for the case of composite hypotheses are investigated in Kharin (2008) and Kharin (2011a) using the methodology of asymptotic expansions of the characteristics w.r.t. the small distortion parameter developed in Kharin and Shlyk (2009) and Kharin (2005).
In Chernov and Kharin (2013) the error probabilities of the sequential probability ratio test (SPRT) under functional distortions described by neighborhoods in the L2-metric were studied.
In this paper we consider the case of a continuous probability distribution of observations and analyze the influence of distortions in the L1-metric on the error probabilities of the SPRT. For a given maximal possible distance between the factual and the hypothetical probability distributions of the log likelihood ratio statistic, the least favorable distributions (LFD) that maximize the conditional error probability of the SPRT are constructed. This maximal value of the error probability is required for the quantitative robustness analysis of sequential tests.
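There are two simple hypotheses concerning the unknown value of the parameter $\theta$:
$$H_0: \theta = \theta_0, \quad H_1: \theta = \theta_1. \tag{1}$$
Denote the accumulated log likelihood ratio test statistic:
$$\Lambda_n = \Lambda_n(x_1, \ldots, x_n) = \sum_{k=1}^{n} \lambda_k, \quad n \in \mathbb{N}, \tag{2}$$
where
$$\lambda_k = \lambda(x_k) \tag{3}$$
is the logarithm of the likelihood ratio statistic calculated for the observation $x_k$, $k \in \mathbb{N}$.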
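To test the hypotheses (1) by the observations $x_1, x_2, \ldots$ the SPRT (Wald 1947) can be used:
$$N = \min\{n \in \mathbb{N} : \Lambda_n \notin (C_-, C_+)\}, \tag{4}$$
$$d = 1_{[C_+, +\infty)}(\Lambda_N), \tag{5}$$
where $N$ is the random stopping time; at this time point the decision $d$ is made according to (5). In (4) the parameters $C_-, C_+ \in \mathbb{R}$ are the test thresholds defined according to Wald (1947):
$$C_- = \ln\frac{\beta_0}{1 - \alpha_0}, \quad C_+ = \ln\frac{1 - \beta_0}{\alpha_0}, \tag{6}$$
where $\alpha_0, \beta_0 \in (0, \tfrac{1}{2})$ are the given maximal admissible values of the probabilities of type I (acceptance of $H_1$ provided $H_0$ is true) and type II (acceptance of $H_0$ provided $H_1$ is true) errors respectively.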
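The stopping rule and the decision rule of the SPRT can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `wald_thresholds` and `sprt` are hypothetical, and the sketch assumes that the test stops as soon as the accumulated statistic leaves the open interval between the two thresholds.

```python
import math
from typing import Iterable, Tuple


def wald_thresholds(alpha0: float, beta0: float) -> Tuple[float, float]:
    """Wald's approximate thresholds (C_minus, C_plus) for given error levels."""
    c_minus = math.log(beta0 / (1.0 - alpha0))
    c_plus = math.log((1.0 - beta0) / alpha0)
    return c_minus, c_plus


def sprt(increments: Iterable[float], c_minus: float, c_plus: float) -> Tuple[int, int]:
    """Run the SPRT on log likelihood ratio increments lambda_1, lambda_2, ...

    Returns (N, d): the stopping time N and the decision d
    (d = 1 accepts H_1, d = 0 accepts H_0).
    """
    total = 0.0
    for n, lam in enumerate(increments, start=1):
        total += lam                      # accumulated statistic Lambda_n
        if total >= c_plus:
            return n, 1
        if total <= c_minus:
            return n, 0
    raise ValueError("stream exhausted before a decision was reached")


c_minus, c_plus = wald_thresholds(0.05, 0.05)
print(sprt([1.0, 1.0, 1.0], c_minus, c_plus))   # stops at n = 3 and accepts H_1: (3, 1)
```

The function takes increments rather than raw observations so that the same routine can be reused for any model, or for a distorted or trimmed statistic.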
It is known that $\alpha_0$ and $\beta_0$ are only approximate values of the factual error probabilities $\alpha(f)$ and $\beta(f)$ of types I and II for the SPRT (4)-(6) (see Wald 1947) and can deviate from $\alpha(f)$ and $\beta(f)$ significantly (Kharin 2013a).
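The gap between the nominal level and the factual type I error probability can be observed in a small Monte Carlo experiment. The sketch below rests on a toy model that is an assumption of this illustration and is not taken from the paper: observations are N(0, 1) under the null hypothesis and N(1, 1) under the alternative, so the log likelihood ratio of an observation x equals x - 0.5. The function name `estimate_type1_error` is hypothetical.

```python
import math
import random


def estimate_type1_error(alpha0: float, beta0: float,
                         runs: int = 20000, seed: int = 1) -> float:
    """Monte Carlo estimate of the factual type I error of the SPRT.

    Assumed toy model: H_0 is N(0, 1), H_1 is N(1, 1), hence the
    log likelihood ratio increment for an observation x is x - 0.5.
    """
    c_minus = math.log(beta0 / (1.0 - alpha0))
    c_plus = math.log((1.0 - beta0) / alpha0)
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(runs):
        total = 0.0
        while c_minus < total < c_plus:
            x = rng.gauss(0.0, 1.0)       # observation generated under H_0
            total += x - 0.5              # log likelihood ratio increment
        if total >= c_plus:
            false_alarms += 1             # H_1 accepted although H_0 is true
    return false_alarms / runs


print(estimate_type1_error(0.05, 0.05))
```

For thresholds chosen by Wald's formulas the factual type I error cannot exceed the bound α_0 / (1 - β_0), which follows from Wald's inequalities; the estimate shows how far below that bound the factual value lies in this particular model.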
Without loss of generality, suppose that the hypothesis $H_0$ is true, so the value of the type I error probability $\alpha$ is considered. To shorten the formulation, introduce the simplified notation: where $P_{H_0}\{\cdot\}$ means the probability under the hypothesis $H_0$. Let the probability density function $p_\lambda(x)$ correspond to the cumulative distribution function $F_\lambda(x)$.

Inequalities for Error Probabilities of the SPRT
Let $x(\omega)$ and $y(\omega)$ be random variables on some probability space $(\Omega, \mathcal{F}, P)$ with some probability density functions $a(x)$ and $b(y)$ respectively; let also $1_A(\cdot)$ be the indicator function of the set $A$.
Proof. It follows from the Lemma condition that From (5) we have where $N$ is the random stopping time. Because of (7) we get the relation between the random events:
Lemma 2 If the inequality $\lambda(x) \ge \lambda(y)$ is satisfied for then the inequality $\alpha(a) \ge \alpha(b)$ holds.
Proof. From the norm conditions for $a(\cdot)$, $b(\cdot)$ we have: Using these equations denote Note that if $p = 0$, then $a(\cdot)$ and $b(\cdot)$ coincide; if $p = 1$, they are orthogonal in the sense that $a(x)b(x) = 0$, $\forall x$.
The norm condition is satisfied for the functions determined by (8): The p.d.f.s $p_{\xi_+}(\cdot)$ and $p_{\xi_-}(\cdot)$ are orthogonal, and Construct random variables $\xi_a = \xi_a(\omega)$, $\xi_b = \xi_b(\omega)$ on $(\Omega, \mathcal{F}, P)$: The p.d.f.s of the random variables (9) can be found by (8): Analogously we get From the construction of $\xi_-$, $\xi_+$ and the condition of this Lemma it follows that $\lambda(\xi$ Analyze now the two available cases using (9).
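The monotonicity expressed by Lemma 2 can be checked numerically on a toy example. The sketch below is an illustration under assumptions that are not taken from the paper: Gaussian increments with mean -0.5 stand in for the hypothetical log likelihood ratios, and a second sequence is obtained by adding a nonnegative random shift to every increment, so that it dominates the first one along every path. The names `coupled_decisions` and `compare_error_rates` are hypothetical.

```python
import math
import random


def coupled_decisions(rng, c_minus, c_plus, max_shift=0.3):
    """Run two SPRTs on coupled increments with lam_a >= lam_b at every step.

    Returns (d_a, d_b), where 1 means that H_1 is accepted.
    """
    sum_a = sum_b = 0.0
    d_a = d_b = None
    while d_a is None or d_b is None:
        lam_b = rng.gauss(-0.5, 1.0)                     # baseline increment
        lam_a = lam_b + rng.uniform(0.0, max_shift)      # dominating increment
        if d_a is None:
            sum_a += lam_a
            if sum_a >= c_plus:
                d_a = 1
            elif sum_a <= c_minus:
                d_a = 0
        if d_b is None:
            sum_b += lam_b
            if sum_b >= c_plus:
                d_b = 1
            elif sum_b <= c_minus:
                d_b = 0
    return d_a, d_b


def compare_error_rates(runs=5000, seed=7, alpha0=0.05, beta0=0.05):
    """Estimate both type I error rates and check pathwise dominance."""
    c_minus = math.log(beta0 / (1.0 - alpha0))
    c_plus = math.log((1.0 - beta0) / alpha0)
    rng = random.Random(seed)
    results = [coupled_decisions(rng, c_minus, c_plus) for _ in range(runs)]
    alpha_a = sum(d_a for d_a, _ in results) / runs
    alpha_b = sum(d_b for _, d_b in results) / runs
    dominated = all(d_a >= d_b for d_a, d_b in results)
    return alpha_a, alpha_b, dominated


print(compare_error_rates())
```

Because the accumulated sum of the dominating sequence is never below that of the baseline, every run in which the baseline test wrongly accepts the alternative is also such a run for the dominating test, which is the pathwise form of the inequality between the two error probabilities.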

Robustness Evaluation for SPRT
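Let the hypothetical model described in Section 1 be violated, so that the log likelihood ratios $\lambda_n = \lambda(x_n)$, $n \in \mathbb{N}$, are independent and identically distributed random variables with some p.d.f. $\bar p_\lambda(x)$ that may deviate from the hypothetical p.d.f. $p_\lambda(x)$, but the distance between $\bar p_\lambda(x)$ and $p_\lambda(x)$ in the L1-metric does not exceed $\varepsilon$:
$$\int_{-\infty}^{+\infty} |\bar p_\lambda(x) - p_\lambda(x)|\,dx \le \varepsilon, \tag{12}$$
where $0 \le \varepsilon \le \varepsilon_0$, and the maximal admissible deviation $\varepsilon_0$ is a priori known.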
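Membership of a distorted density in an ε-neighborhood in the L1-metric can be checked numerically. The following sketch is only an illustration under stated assumptions: the hypothetical density of the log likelihood ratio is taken to be N(-0.5, 1), the distorted density is a mixture that moves a fraction `delta` of the mass to N(3, 1), and the L1 distance is taken as the plain integral of the absolute difference. None of these choices, nor the function names, come from the paper.

```python
import math


def normal_pdf(x, mean, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def l1_distance(pdf_a, pdf_b, lo=-12.0, hi=12.0, n=20000):
    """Midpoint-rule approximation of the integral of |pdf_a - pdf_b| over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += abs(pdf_a(x) - pdf_b(x))
    return total * h


def hypothetical(x):
    # assumed hypothetical density of the log likelihood ratio
    return normal_pdf(x, -0.5)


def distorted(x, delta=0.05):
    # assumed distortion: a fraction delta of the mass is moved to N(3, 1)
    return (1.0 - delta) * normal_pdf(x, -0.5) + delta * normal_pdf(x, 3.0)


def in_neighborhood(pdf, center, eps):
    """True if pdf lies within L1 distance eps of center."""
    return l1_distance(pdf, center) <= eps


print(l1_distance(distorted, hypothetical))
print(in_neighborhood(distorted, hypothetical, 0.1))
```

For a mixture of this kind the L1 distance to the hypothetical density never exceeds twice the mixing fraction, so a contamination level `delta` always fits into the neighborhood of radius `2 * delta`.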
Let us construct the least favorable probability distribution of $\lambda_n$, i.e. the p.d.f. that maximizes the value of $\alpha(\cdot, \varepsilon)$ within the set $L_1(p_\lambda, \varepsilon)$.
Calculate now the instability coefficient $\kappa$ (Kharin 2013b) that characterizes the relative increment of the type I error probability for the SPRT under the distortion (12) with respect to the hypothetical version:
Corollary 2 The instability coefficient for the type I error probability of the SPRT is equal to
Proof. The result follows from Lemma 3, Theorem 1 and Corollary 1.
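A relative increment of the type I error probability can be estimated by simulation. The sketch below does not reproduce the closed-form coefficient of Corollary 2 and does not use the least favorable distribution: it assumes the relative increment is the difference between the distorted and the hypothetical error probabilities divided by the hypothetical one, and it evaluates this quantity for one particular admissible distortion (a mixture with N(3, 1) increments) in an assumed Gaussian toy model. All names are hypothetical.

```python
import math
import random


def type1_error(draw_increment, c_minus, c_plus, runs, rng):
    """Monte Carlo estimate of the probability of wrongly accepting H_1."""
    hits = 0
    for _ in range(runs):
        total = 0.0
        while c_minus < total < c_plus:
            total += draw_increment(rng)
        if total >= c_plus:
            hits += 1
    return hits / runs


def relative_increment(alpha_hyp, alpha_dist):
    """Assumed form of the relative increment of the error probability."""
    return (alpha_dist - alpha_hyp) / alpha_hyp


def demo(delta=0.05, runs=20000, seed=3):
    c_minus = math.log(0.05 / 0.95)
    c_plus = math.log(0.95 / 0.05)
    rng = random.Random(seed)

    def hyp(r):
        # hypothetical increments under H_0 (assumed toy model)
        return r.gauss(-0.5, 1.0)

    def dist(r):
        # one admissible distortion, not the least favorable one
        return r.gauss(3.0, 1.0) if r.random() < delta else r.gauss(-0.5, 1.0)

    alpha_hyp = type1_error(hyp, c_minus, c_plus, runs, rng)
    alpha_dist = type1_error(dist, c_minus, c_plus, runs, rng)
    return alpha_hyp, alpha_dist, relative_increment(alpha_hyp, alpha_dist)


print(demo())
```

Since the distortion used here is not the least favorable one, the value obtained this way can only serve as a lower reference point for the worst-case increment within the same neighborhood.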

Robustness Evaluation for Trimmed SPRT
To decrease the influence of distortions on the error probabilities of the test (4), (5) we construct the trimmed probability density function $p^g_\lambda(x)$ for the log likelihood ratio (3) following the idea of Kharin (2002): where $g_-, g_+ \in \mathbb{R}$, $g_- < g_+$, are some trimming parameters for $\lambda_n$; Note that the function $p^g_\lambda(x)$ defined by (17) is a probability density function, as it is nonnegative and the norm condition holds: The sequential test (4)-(6) constructed using the test statistic with the trimmed probability density function (17) instead of $\lambda(\cdot)$ will be called the trimmed SPRT. If $g_- = -\infty$ and $g_+ = +\infty$, then the trimmed p.d.f. $p^g_\lambda(\cdot)$ coincides with $p_\lambda(\cdot)$, i.e. we have no trimming. Prove now that if the p.d.f. $\bar p_\lambda(\cdot)$ belongs to the ε-neighborhood in the L1-metric of the function $p_\lambda(\cdot)$, then the trimmed p.d.f. $\bar p^g_\lambda(x)$ belongs to the ε-neighborhood of the function $p^g_\lambda(\cdot)$ in the same metric.
Proof. Using (17), (18) evaluate the distance: which proves the statement of the Lemma.
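The statement that trimming does not move a distorted density out of the ε-neighborhood can be illustrated numerically. The sketch below assumes, as one plausible reading of the trimming idea and not as a reproduction of formula (17), that trimming clips the log likelihood ratio to the interval between the two trimming parameters, so that the mass outside the interval is collected in two atoms at its endpoints. The Gaussian densities, the contamination level and all function names are likewise assumptions of this illustration.

```python
import math

DELTA = 0.05  # assumed contamination level


def normal_pdf(x, mean):
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def normal_cdf(x, mean):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0)))


def p_hyp(x):
    return normal_pdf(x, -0.5)


def cdf_hyp(x):
    return normal_cdf(x, -0.5)


def p_dist(x):
    return (1.0 - DELTA) * normal_pdf(x, -0.5) + DELTA * normal_pdf(x, 3.0)


def cdf_dist(x):
    return (1.0 - DELTA) * normal_cdf(x, -0.5) + DELTA * normal_cdf(x, 3.0)


def integral_abs_diff(f, g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of |f - g| over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(abs(f(lo + (i + 0.5) * h) - g(lo + (i + 0.5) * h))
                   for i in range(n))


def l1_untrimmed():
    return integral_abs_diff(p_dist, p_hyp, -12.0, 12.0)


def l1_trimmed(g_minus, g_plus):
    """L1 distance after clipping to [g_minus, g_plus]: two atoms plus the inner part."""
    left_atom = abs(cdf_dist(g_minus) - cdf_hyp(g_minus))
    right_atom = abs((1.0 - cdf_dist(g_plus)) - (1.0 - cdf_hyp(g_plus)))
    inner = integral_abs_diff(p_dist, p_hyp, g_minus, g_plus)
    return left_atom + inner + right_atom


print(l1_untrimmed(), l1_trimmed(-1.0, 1.0))
```

Under the clipping assumption the distance after trimming cannot exceed the distance before trimming, because the absolute difference of the tail masses is bounded by the integral of the absolute difference of the densities over the same tail.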
Proof. The Theorem statement follows from Lemma 4 and Theorem 1.
Proof. The Corollary statement follows from Lemma 4 and Theorem 1.
Now calculate the instability coefficient (Kharin 2011b) for the type I error probability of the trimmed SPRT under the distortion (12).
Corollary 4 The instability coefficient for the type I error probability of the trimmed SPRT is equal to
Proof. The result follows from Lemma 4, Theorem 2 and Corollary 3.

Conclusions
The least favorable probability distributions of the log likelihood ratio statistic are constructed in the paper for distortions in the L1-metric. The obtained results are useful for evaluating the difference between the hypothetical and the actual error probabilities under functional distortions of the observation distributions measured in this metric.