Sequential Probability Ratio Test for Fuzzy Hypotheses Testing with Vague Data

In hypotheses testing, as in other statistical problems, we may confront imprecise concepts. One such case is a situation in which both the hypotheses and the observations are imprecise. In this paper, we redefine some concepts of fuzzy hypotheses testing, and then give the sequential probability ratio test for fuzzy hypotheses testing with fuzzy observations. Finally, we give some applied examples.


Introduction
Fuzzy set theory is a powerful and well-known tool for the formulation and analysis of imprecise and subjective situations where exact analysis is either difficult or impossible. Some methods in descriptive statistics with vague data and some aspects of statistical inference are proposed in Kruse and Meyer (1987). Fuzzy random variables were introduced by Kwakernaak (1978) and by Puri and Ralescu (1986) as a generalization of compact random sets (Kruse and Meyer, 1987), and were developed by others such as Juninig and Wang (1989), Ralescu (1995), López-Díaz and Gil (1997), López-Díaz and Gil (1998), and Liu (2004).
In this paper, because of our main purpose (statistical inference about a parametric population with fuzzy data), we only consider and discuss fuzzy random variables associated with an ordinary random variable.
Decision making in classical statistical inference is based on crispness of data, random variables, exact hypotheses, decision rules and so on. As there are many different situations in which the above assumptions are rather unrealistic, there have been some attempts to analyze these situations with fuzzy set theory proposed by Zadeh (1965).
One of the primary purposes of statistical inference is to test hypotheses. In the traditional approach to hypotheses testing, all the concepts are precise and well defined (see, e.g., Lehmann, 1994, Casella and Berger, 2002, and Shao, 1999). However, if we introduce vagueness into hypotheses, we face quite new and interesting problems. Arnold

Preliminaries
Let $(\Omega, \mathcal{F}, P)$ be a probability space. A random variable (RV) $X$ is a measurable function from $(\Omega, \mathcal{F}, P)$ to $(\mathbb{R}, \mathcal{B}, P_X)$, where $P_X$ is the probability measure induced by $X$ and is called the distribution of the RV $X$, i.e.,
$$P_X(A) = P(X \in A) = P(\{\omega \in \Omega : X(\omega) \in A\}), \quad A \in \mathcal{B}.$$
Using "the change of variable rule" (see, e.g., Billingsley, 1995, pp. 215-216, or Shao, 1999, p. 13), we have
$$P_X(A) = \int_{X^{-1}(A)} dP(\omega) = \int_A dP_X(x).$$
If $P_X$ is dominated by a σ-finite measure $\nu$, i.e., $P_X \ll \nu$, then using the Radon-Nikodym theorem (see, e.g., Billingsley, 1995, pp. 422-423, or Shao, 1999, p. 14), we have
$$P_X(A) = \int_A f(x)\, d\nu(x),$$
where $f(x)$ is the Radon-Nikodym derivative of $P_X$ with respect to $\nu$ and is called the probability density function (PDF) of $X$ with respect to $\nu$.
In statistical texts, the measure $\nu$ is usually a "counting measure" or a "Lebesgue measure"; hence $P_X(A) = \sum_{x \in A} f(x)$ or $P_X(A) = \int_A f(x)\, dx$, respectively. The set $\mathcal{X} = \{x : f(x) > 0\}$ is usually called the "support" or "sample space" of $X$. A random vector $\mathbf{X} = (X_1, \dots, X_n)$ is said to be a random sample of size $n$ from a population with PDF $f(x)$ if the $X_i$'s are independent and identically distributed, each with PDF $f(x)$. In this case, we have
$$f(\mathbf{x}) = f(x_1, \dots, x_n) = \prod_{i=1}^{n} f(x_i).$$
In the following we present two definitions from the introduction of Casals et al. (1986), but in a slightly different way.
Definition 1 A fuzzy sample space $\tilde{\mathcal{X}}$ is a fuzzy partition (Ruspini partition) of $\mathcal{X}$, i.e., a set of fuzzy subsets of $\mathcal{X}$ whose membership functions are Borel measurable and satisfy the orthogonality constraint
$$\sum_{\tilde{x} \in \tilde{\mathcal{X}}} \mu_{\tilde{x}}(x) = 1, \quad \text{for each } x \in \mathcal{X}.$$
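The orthogonality constraint of Definition 1 can be checked numerically for a candidate family of membership functions. The following Python sketch is only an illustration: the three piecewise-linear membership functions and the helper name `is_ruspini_partition` are assumptions introduced here and do not come from the paper.

```python
from typing import Callable, Iterable, Sequence

Membership = Callable[[float], float]


def is_ruspini_partition(memberships: Sequence[Membership],
                         points: Iterable[float],
                         tol: float = 1e-9) -> bool:
    """Check the orthogonality constraint sum_k mu_k(x) = 1 at each given point."""
    return all(abs(sum(mu(x) for mu in memberships) - 1.0) <= tol for x in points)


# Hypothetical fuzzy subsets of the real line (illustration only).
def mu_small(x: float) -> float:      # "very small"
    return 1.0 if x <= -1 else (-x if x < 0 else 0.0)


def mu_near_zero(x: float) -> float:  # "near to zero"
    return max(0.0, 1.0 - abs(x))


def mu_large(x: float) -> float:      # "very large"
    return 1.0 if x >= 1 else (x if x > 0 else 0.0)


# Evaluate the constraint on a finite grid covering [-3, 3].
grid = [i / 10 for i in range(-30, 31)]
print(is_ruspini_partition([mu_small, mu_near_zero, mu_large], grid))
```

A check on a finite grid can only refute the constraint; it does not prove that it holds at every point of the support.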
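Definition 2 A fuzzy random sample (FRS) of size $n$, $\tilde{\mathbf{X}} = (\tilde{X}_1, \dots, \tilde{X}_n)$, associated with the PDF $f(x)$ is a measurable function from $\Omega$ to $\tilde{\mathcal{X}}^n$ whose PDF is given by
$$\tilde{f}(\tilde{\mathbf{x}}) = P(\tilde{\mathbf{X}} = \tilde{\mathbf{x}}) = \int_{\mathcal{X}^n} \prod_{i=1}^{n} \mu_{\tilde{x}_i}(x_i)\, f(x_i)\, d\nu(x_1) \cdots d\nu(x_n).$$
The density $\tilde{f}(\tilde{\mathbf{x}})$ is often called the fuzzy probability density function of $\tilde{\mathbf{X}}$.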
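The above definition is according to Zadeh (1968). Note that, using Fubini's theorem (see Billingsley, 1995, pp. 233-234), we obtain independence of the $\tilde{X}_i$'s, i.e.,
$$\tilde{f}(\tilde{\mathbf{x}}) = \prod_{i=1}^{n} \tilde{f}(\tilde{x}_i), \qquad \tilde{f}(\tilde{x}_i) = \int_{\mathcal{X}} \mu_{\tilde{x}_i}(x)\, f(x)\, d\nu(x),$$
and $\tilde{f}(\tilde{x}_i)$ is the PDF of the fuzzy random variable (FRV) $\tilde{X}_i$, for each $i = 1, \dots, n$. For each $i$, $\tilde{f}(\tilde{x}_i)$ really is a PDF on $\tilde{\mathcal{X}}$, because by the orthogonality of the $\mu_{\tilde{x}_i}$'s, we have
$$\sum_{\tilde{x}_i \in \tilde{\mathcal{X}}} \tilde{f}(\tilde{x}_i) = \int_{\mathcal{X}} \Big( \sum_{\tilde{x}_i \in \tilde{\mathcal{X}}} \mu_{\tilde{x}_i}(x) \Big) f(x)\, d\nu(x) = \int_{\mathcal{X}} f(x)\, d\nu(x) = 1.$$
Theorem 1 Let $\tilde{\mathbf{X}}$ be a fuzzy random sample with fuzzy sample space $\tilde{\mathcal{X}}^n$, and let $g$ be a measurable function from $\tilde{\mathcal{X}}^n$ to $\mathbb{R}$. Then $g(\tilde{\mathbf{X}})$ is an ordinary random variable.
Proof: $\tilde{\mathbf{X}}$ is a measurable function from $\Omega$ to $\tilde{\mathcal{X}}^n$ and $g$ is a measurable function from $\tilde{\mathcal{X}}^n$ to $\mathbb{R}$. Hence, $g(\tilde{\mathbf{X}}(\omega)) = (g \circ \tilde{\mathbf{X}})(\omega)$ is a composition of two measurable functions and is therefore measurable from $\Omega$ to $\mathbb{R}$ (see Billingsley, 1995, p. 182).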
Note that using Theorem 1, we can define and use all related concepts for ordinary random variables, such as expectation, variance, etc.
Theorem 2 Let $\tilde{\mathbf{X}}$ be a fuzzy random sample with fuzzy sample space $\tilde{\mathcal{X}}^n$, and let $g$ be a measurable function from $\tilde{\mathcal{X}}^n$ to $\mathbb{R}$. The expectation of $g(\tilde{\mathbf{X}})$ is calculated by
$$E[g(\tilde{\mathbf{X}})] = \sum_{\tilde{\mathbf{x}} \in \tilde{\mathcal{X}}^n} g(\tilde{\mathbf{x}})\, \tilde{f}(\tilde{\mathbf{x}}).$$
Proof: The result follows from the change of variable rule and the Radon-Nikodym theorem.
For more details about properties of ordinary RV's and their moments see, e.g., Ash and Doleans-Dade (2000), Billingsley (1995), Chung (2000), Feller (1968), or Ross (2002).
Austrian Journal of Statistics, Vol. 34 (2005), No. 1, 25-38
In this paper, we suppose that the PDF of the population is known but it has an unknown parameter $\theta \in \Theta$. In this case, we index $f$ by $\theta$ and write $f(x; \theta)$.
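Example 1 Let $X$ be a Bernoulli variable with parameter $\theta$, i.e.,
$$f(x; \theta) = \theta^{x} (1 - \theta)^{1 - x}, \quad x = 0, 1, \quad 0 < \theta < 1.$$
We have $\mathcal{X} = \{0, 1\}$. Let $\tilde{x}_1$ and $\tilde{x}_2$ be two fuzzy subsets of $\mathcal{X}$ with membership functions
$$\mu_{\tilde{x}_1}(x) = \begin{cases} 0.9, & x = 0 \\ 0.1, & x = 1 \end{cases} \qquad \mu_{\tilde{x}_2}(x) = \begin{cases} 0.1, & x = 0 \\ 0.9, & x = 1. \end{cases}$$
Note that $\tilde{x}_1$ and $\tilde{x}_2$ represent the values "approximately zero" and "approximately one", respectively. Here, the support of $\tilde{X}$ is $\tilde{\mathcal{X}} = \{\tilde{x}_1, \tilde{x}_2\}$. Let $Y = 0.1$ if $\tilde{X} = \tilde{x}_1$ and $Y = 0.9$ if $\tilde{X} = \tilde{x}_2$. Note that $Y$ is a measurable function from $\tilde{\mathcal{X}}$ to $\mathbb{R}$ and therefore is a classical random variable. In the following, we calculate the mean and the variance of $Y$. The PDF of $Y$ is
$$f_Y(y; \theta) = \begin{cases} 0.9 - 0.8\theta, & y = 0.1 \\ 0.1 + 0.8\theta, & y = 0.9. \end{cases}$$
Hence, $E(Y) = 0.18 + 0.64\theta$ and $\mathrm{Var}(Y) = 0.64\,(0.1 + 0.8\theta)(0.9 - 0.8\theta)$.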
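The computation in Example 1 can be reproduced with a few lines of Python. This is a minimal sketch under one assumption: the membership values 0.9 and 0.1 used below are the ones implied by the stated PDF of $Y$ when the fuzzy PDF is taken as the membership-weighted sum of the Bernoulli probabilities (Definition 2). All function and variable names are introduced here for illustration.

```python
def bernoulli_pmf(x: int, theta: float) -> float:
    """Ordinary Bernoulli(theta) probability of x in {0, 1}."""
    return theta if x == 1 else 1.0 - theta


# Assumed membership values of the two fuzzy data
# ("approximately zero" and "approximately one").
MU = {
    "x1": {0: 0.9, 1: 0.1},
    "x2": {0: 0.1, 1: 0.9},
}
# Value taken by Y on each fuzzy datum, as in the PDF of Y in Example 1.
Y_VALUE = {"x1": 0.1, "x2": 0.9}


def fuzzy_pmf(label: str, theta: float) -> float:
    """Fuzzy PDF: sum over x of mu_label(x) * f(x; theta)."""
    return sum(MU[label][x] * bernoulli_pmf(x, theta) for x in (0, 1))


def mean_and_variance_of_y(theta: float) -> tuple[float, float]:
    """Mean and variance of Y computed from its two-point distribution."""
    mean = sum(Y_VALUE[k] * fuzzy_pmf(k, theta) for k in MU)
    second_moment = sum(Y_VALUE[k] ** 2 * fuzzy_pmf(k, theta) for k in MU)
    return mean, second_moment - mean ** 2


mean, variance = mean_and_variance_of_y(0.5)
print(mean, variance)
```

Under this assumed setup, the values agree with the closed forms $E(Y) = 0.18 + 0.64\theta$ and $\mathrm{Var}(Y) = 0.64\,(0.1 + 0.8\theta)(0.9 - 0.8\theta)$, which follow directly from the PDF of $Y$.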

Fuzzy Hypotheses Testing
In this section we introduce concepts about fuzzy hypotheses testing (FHT).
Definition 3 Any hypothesis of the form "H : θ is H(θ)" is called a fuzzy hypothesis, where "H : θ is H(θ)" implies that θ is in a fuzzy set of Θ, the parameter space, with membership function H(θ), i.e., a function from Θ to [0, 1].
Note that the ordinary hypothesis $H : \theta \in \Theta_H$, where $\Theta_H \subseteq \Theta$ is a crisp set, is a fuzzy hypothesis with membership function $H(\theta) = 1$ at $\theta \in \Theta_H$, and zero otherwise, i.e., the indicator function of the crisp set $\Theta_H$.
Example 2 Let θ be the parameter of a Bernoulli distribution. Consider the following function: The hypothesis "H : θ is H(θ)" is a fuzzy hypothesis and it means that "θ is approximately 1/2".
In FHT with fuzzy data, the main problem is testing
$$H_0 : \theta \text{ is } H_0(\theta) \quad \text{versus} \quad H_1 : \theta \text{ is } H_1(\theta) \qquad (1)$$
according to a fuzzy random sample $\tilde{\mathbf{X}} = (\tilde{X}_1, \dots, \tilde{X}_n)$ from a parametric fuzzy population with PDF $\tilde{f}(\tilde{x}; \theta)$. In the following we give some definitions in FHT theory with fuzzy data.

Definition 4 The normalized membership function of $H_j(\theta)$ is defined by
$$H_j^*(\theta) = \frac{H_j(\theta)}{\int_{\Theta} H_j(\theta)\, d\theta}, \quad j = 0, 1.$$
Replace integration by summation in discrete cases. Note that the normalized membership function is not necessarily a membership function, i.e., it may be greater than 1 for some values of θ.
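To illustrate Definition 4, the sketch below normalizes a membership function numerically, assuming that the normalization divides $H_j(\theta)$ by its integral over $\Theta$ (the reading suggested by the remark about replacing integration by summation). The triangular membership function for "θ is approximately 1/2" and the helper `normalize_membership` are hypothetical choices made for this illustration.

```python
from typing import Callable


def normalize_membership(h: Callable[[float], float],
                         lower: float,
                         upper: float,
                         steps: int = 100_000) -> Callable[[float], float]:
    """Return theta -> h(theta) / integral of h over [lower, upper].

    The integral is approximated with the midpoint rule.
    """
    width = (upper - lower) / steps
    area = sum(h(lower + (i + 0.5) * width) for i in range(steps)) * width
    return lambda theta: h(theta) / area


# Hypothetical membership function for "theta is approximately 1/2" on (0, 1).
def h_about_half(theta: float) -> float:
    return 2 * theta if theta < 0.5 else 2 - 2 * theta


h_star = normalize_membership(h_about_half, 0.0, 1.0)
print(h_star(0.5))  # exceeds 1, so h_star is not itself a membership function
```

For this choice the integral of the membership function is 1/2, so the normalized function reaches about 2 at θ = 1/2, which illustrates the remark that a normalized membership function may exceed 1.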
In FHT with fuzzy data, as in traditional hypotheses testing, we must give a test function $\Phi(\tilde{\mathbf{X}})$, which is defined as follows.
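Definition 5 Let $\tilde{\mathbf{X}}$ be a FRS with PDF $\tilde{f}(\tilde{\mathbf{x}}; \theta)$. $\Phi(\tilde{\mathbf{X}})$ is called a fuzzy test function if it is the probability of rejecting $H_0$ provided $\tilde{\mathbf{X}} = \tilde{\mathbf{x}}$ is observed.
Definition 6 Let the FRV $\tilde{X}$ have PDF $\tilde{f}(\tilde{x}; \theta)$. Under $H_j(\theta)$, $j = 0, 1$, the weighted probability density function (WPDF) of $\tilde{X}$ is defined by
$$\tilde{f}_j(\tilde{x}) = \int_{\Theta} H_j^*(\theta)\, \tilde{f}(\tilde{x}; \theta)\, d\theta.$$
Remark 1 $\tilde{f}_j(\tilde{x})$ is a PDF, since $\tilde{f}_j(\tilde{x})$ is nonnegative and
$$\sum_{\tilde{x} \in \tilde{\mathcal{X}}} \tilde{f}_j(\tilde{x}) = \int_{\Theta} H_j^*(\theta) \sum_{\tilde{x} \in \tilde{\mathcal{X}}} \tilde{f}(\tilde{x}; \theta)\, d\theta = \int_{\Theta} H_j^*(\theta)\, d\theta = 1.$$
Hence, $\tilde{f}_j(\tilde{x}_1, \dots, \tilde{x}_n)$ is also a joint PDF.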
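As a numerical illustration of the WPDF, the following sketch assumes that it is obtained by averaging the fuzzy PDF over θ with the normalized membership function as weight. The fuzzy PDFs are those of the two fuzzy data in Example 1; the membership function `h_large` for "θ is large" and the helper `weighted_pdf` are hypothetical and introduced only for this example.

```python
from typing import Callable


def weighted_pdf(fuzzy_pdf: Callable[[float], float],
                 h: Callable[[float], float],
                 lower: float = 0.0,
                 upper: float = 1.0,
                 steps: int = 20_000) -> float:
    """Approximate the integral of H*(theta) * f~(x~; theta) over theta.

    H* is h divided by its integral; both integrals use the midpoint rule.
    """
    width = (upper - lower) / steps
    grid = [lower + (i + 0.5) * width for i in range(steps)]
    area = sum(h(t) for t in grid) * width
    return sum(h(t) * fuzzy_pdf(t) for t in grid) * width / area


# Fuzzy PDFs of the two fuzzy data of Example 1, as functions of theta.
def fuzzy_pdf_x1(theta: float) -> float:
    return 0.9 - 0.8 * theta


def fuzzy_pdf_x2(theta: float) -> float:
    return 0.1 + 0.8 * theta


# Hypothetical membership function for "theta is large" on (0, 1).
def h_large(theta: float) -> float:
    return theta


w1 = weighted_pdf(fuzzy_pdf_x1, h_large)
w2 = weighted_pdf(fuzzy_pdf_x2, h_large)
print(w1, w2, w1 + w2)
```

The two weighted values sum to 1 up to rounding, in line with Remark 1.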
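Definition 7 Let $\Phi(\tilde{\mathbf{X}})$ be a fuzzy test function. The probabilities of type I and II errors of $\Phi(\tilde{\mathbf{X}})$ for the fuzzy testing problem (1) are defined by
$$\alpha_\Phi = E_0[\Phi(\tilde{\mathbf{X}})] \quad \text{and} \quad \beta_\Phi = 1 - E_1[\Phi(\tilde{\mathbf{X}})],$$
respectively, where $E_j[\Phi(\tilde{\mathbf{X}})]$ is the expected value of $\Phi(\tilde{\mathbf{X}})$ over the joint WPDF $\tilde{f}_j(\tilde{\mathbf{x}})$, $j = 0, 1$.
Note that in the case of testing a simple crisp hypothesis against a simple crisp alternative, i.e., $H_0 : \theta = \theta_0$ versus $H_1 : \theta = \theta_1$, and for crisp observations, the above definition of $\alpha_\Phi$ and $\beta_\Phi$ gives the classical probabilities of errors.
In view of the definitions of the error sizes, the fuzzy hypotheses testing problem (1) is equivalent to the ordinary problem of testing the simple hypothesis that $\tilde{\mathbf{X}}$ has WPDF $\tilde{f}_0$ against the simple alternative that $\tilde{\mathbf{X}}$ has WPDF $\tilde{f}_1$.
Definition 8 A fuzzy testing problem with a test function $\Phi$ is said to be a test of (significance) level $\alpha$ if $\alpha_\Phi \le \alpha$, where $\alpha \in [0, 1]$. We call $\alpha_\Phi$ the size of $\Phi$.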

Sequential Probability Ratio Test for FHT
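In this section, we first define the sequential probability ratio test (SPRT) for ordinary simple hypotheses testing with crisp observations, and then, building on Section 3, we extend the SPRT to FHT with fuzzy observations. Consider testing a simple null hypothesis against a simple alternative hypothesis. In other words, suppose a sample can be drawn from one of two known distributions and it is desired to test that the sample came from one distribution against the possibility that it came from the other. If $X_1, X_2, \dots$ denote the iid RV's, we want to test $H_0 : X_i \sim f_0(\cdot)$ versus $H_1 : X_i \sim f_1(\cdot)$. For $m = 1, 2, \dots$, let
$$R_m = \frac{L_0(x_1, \dots, x_m)}{L_1(x_1, \dots, x_m)} = \prod_{i=1}^{m} \frac{f_0(x_i)}{f_1(x_i)}.$$
For fixed $k_0$ and $k_1$ satisfying $0 < k_0 < k_1$, adopt the following procedure: Take observation $x_1$ and compute $R_1$; if $R_1 \le k_0$, reject $H_0$; if $R_1 \ge k_1$, accept $H_0$; and if $k_0 < R_1 < k_1$, take observation $x_2$ and compute $R_2$, and so on. The idea is to continue sampling as long as $k_0 < R_j < k_1$ and stop as soon as $R_m \le k_0$ or $R_m \ge k_1$, rejecting $H_0$ if $R_m \le k_0$ and accepting $H_0$ if $R_m \ge k_1$. The critical region of the described sequential test can be defined as $C = \bigcup_{n=1}^{\infty} C_n$, where
$$C_n = \{(x_1, \dots, x_n) : k_0 < R_j < k_1,\ j = 1, \dots, n - 1,\ R_n \le k_0\}.$$
Similarly, the acceptance region can be defined as $A = \bigcup_{n=1}^{\infty} A_n$, where
$$A_n = \{(x_1, \dots, x_n) : k_0 < R_j < k_1,\ j = 1, \dots, n - 1,\ R_n \ge k_1\}.$$
Definition 9 For fixed $k_0$ and $k_1$, a test as described above is defined to be a sequential probability ratio test (SPRT).
Therefore, for the SPRT, the probabilities of type I and II errors are calculated by
$$\alpha = \sum_{n=1}^{\infty} \int_{C_n} L_0(\mathbf{x})\, d\mathbf{x} \quad \text{and} \quad \beta = \sum_{n=1}^{\infty} \int_{A_n} L_1(\mathbf{x})\, d\mathbf{x},$$
respectively. In the following, we briefly state some results about the classical SPRT without proofs. For more details see Mood et al. (1974) or Hogg and Craig (1995).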
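The sequential procedure described above can be sketched in Python as follows. The sketch assumes that the ratio is the likelihood under $H_0$ divided by the likelihood under $H_1$, with $H_0$ rejected at the lower threshold $k_0$ and accepted at the upper threshold $k_1$ (the convention implied by the log-ratio form used later in this section). The function name `sprt` and the Bernoulli densities are illustrative assumptions.

```python
from typing import Callable, Iterable, Tuple


def sprt(observations: Iterable[float],
         f0: Callable[[float], float],
         f1: Callable[[float], float],
         k0: float,
         k1: float) -> Tuple[str, int]:
    """Run the SPRT; return the decision and the number of observations used."""
    ratio = 1.0
    n = 0
    for n, x in enumerate(observations, start=1):
        ratio *= f0(x) / f1(x)        # update R_n = L_0 / L_1
        if ratio <= k0:
            return "reject H0", n
        if ratio >= k1:
            return "accept H0", n
    return "undecided", n             # data exhausted before a boundary was hit


# Illustrative simple hypotheses: Bernoulli(0.3) under H0, Bernoulli(0.7) under H1.
def f0(x: float) -> float:
    return 0.3 if x == 1 else 0.7


def f1(x: float) -> float:
    return 0.7 if x == 1 else 0.3


print(sprt([1, 1, 1, 1, 1, 1], f0, f1, k0=0.05 / 0.9, k1=0.95 / 0.1))
```

The thresholds in the usage line correspond to nominal error sizes 0.05 and 0.1; any pair with $0 < k_0 < k_1$ can be passed instead.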
Let $k_0$ and $k_1$ be defined so that the SPRT has fixed probabilities of type I and II errors $\alpha$ and $\beta$. Then $k_0$ and $k_1$ can be approximated by $k_0' = \alpha/(1 - \beta)$ and $k_1' = (1 - \alpha)/\beta$, respectively. If $\alpha'$ and $\beta'$ are the error sizes of the SPRT defined by $k_0'$ and $k_1'$, then $\alpha' + \beta' \le \alpha + \beta$.
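The approximate thresholds can be computed directly from the nominal error sizes. The helper name `wald_thresholds` below is an assumption introduced for illustration.

```python
from typing import Tuple


def wald_thresholds(alpha: float, beta: float) -> Tuple[float, float]:
    """Approximate SPRT thresholds: alpha / (1 - beta) and (1 - alpha) / beta."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    return alpha / (1.0 - beta), (1.0 - alpha) / beta


k0, k1 = wald_thresholds(alpha=0.05, beta=0.10)
print(k0, k1)
```

The requirement $0 < k_0 < k_1$ holds for these approximations whenever $\alpha + \beta < 1$.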
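If $z_i = \log(f_0(x_i)/f_1(x_i))$, an equivalent test to the SPRT is given by the following: continue sampling as long as $\log(k_0) < \sum_{i=1}^{m} z_i < \log(k_1)$, and stop sampling when $\sum_{i=1}^{m} z_i \le \log(k_0)$ (and reject $H_0$) or $\sum_{i=1}^{m} z_i \ge \log(k_1)$ (and accept $H_0$). Let $N$ denote the (random) sample size of the SPRT. Using Wald's equation, $E\big[\sum_{i=1}^{N} z_i\big] = E[N]\, E[z_i]$, we obtain
$$E[N] \approx \frac{\rho \log(k_0) + (1 - \rho) \log(k_1)}{E[z_i]},$$
where $\rho = P(\text{reject } H_0)$. Hence,
$$E[N \mid H_0 \text{ true}] \approx \frac{\alpha \log\!\big(\tfrac{\alpha}{1 - \beta}\big) + (1 - \alpha) \log\!\big(\tfrac{1 - \alpha}{\beta}\big)}{E[z_i \mid H_0 \text{ true}]}, \qquad E[N \mid H_1 \text{ true}] \approx \frac{(1 - \beta) \log\!\big(\tfrac{\alpha}{1 - \beta}\big) + \beta \log\!\big(\tfrac{1 - \alpha}{\beta}\big)}{E[z_i \mid H_1 \text{ true}]}.$$
Now, we are ready to state the SPRT for fuzzy hypotheses testing with vague data.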
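The expected sample sizes can be approximated numerically. The sketch below assumes the usual consequences of Wald's equation with thresholds $\alpha/(1-\beta)$ and $(1-\alpha)/\beta$: under $H_0$ the numerator is $\alpha \log k_0 + (1-\alpha)\log k_1$, under $H_1$ it is $(1-\beta)\log k_0 + \beta \log k_1$, and each is divided by the mean of $z_i$ under the corresponding hypothesis. The function name and the Bernoulli example are illustrative assumptions.

```python
import math
from typing import Callable, Sequence, Tuple


def expected_sample_sizes(alpha: float,
                          beta: float,
                          pmf0: Callable[[float], float],
                          pmf1: Callable[[float], float],
                          support: Sequence[float]) -> Tuple[float, float]:
    """Approximate E[N | H0 true] and E[N | H1 true] for a discrete SPRT."""
    log_k0 = math.log(alpha / (1.0 - beta))
    log_k1 = math.log((1.0 - alpha) / beta)

    def mean_z(pmf: Callable[[float], float]) -> float:
        # E[z] with z = log(f0(x) / f1(x)), taken under the given pmf.
        return sum(math.log(pmf0(x) / pmf1(x)) * pmf(x) for x in support)

    n_h0 = (alpha * log_k0 + (1.0 - alpha) * log_k1) / mean_z(pmf0)
    n_h1 = ((1.0 - beta) * log_k0 + beta * log_k1) / mean_z(pmf1)
    return n_h0, n_h1


# Illustrative simple hypotheses: Bernoulli(0.3) under H0, Bernoulli(0.7) under H1.
def pmf0(x: float) -> float:
    return 0.3 if x == 1 else 0.7


def pmf1(x: float) -> float:
    return 0.7 if x == 1 else 0.3


n_h0, n_h1 = expected_sample_sizes(0.05, 0.10, pmf0, pmf1, support=(0, 1))
print(math.ceil(n_h0), math.ceil(n_h1))
```

Rounding the approximations up to the next integer gives a practical sample size.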
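Definition 10 Let $\tilde{X}_1, \tilde{X}_2, \dots$ be an iid sequence of FRV's from a population with PDF $\tilde{f}(\cdot; \theta)$. We propose to consider the SPRT for testing $\tilde{X}_i \sim \tilde{f}_0(\cdot)$ versus $\tilde{X}_i \sim \tilde{f}_1(\cdot)$ as a SPRT for fuzzy hypotheses testing (1), in which $\tilde{f}_j(\tilde{x})$ is the WPDF of $\tilde{f}(\tilde{x}; \theta)$ under $H_j^*(\theta)$, $j = 0, 1$ (see Definition 6).
Thus, with $\tilde{R}_m = \prod_{i=1}^{m} \tilde{f}_0(\tilde{x}_i)/\tilde{f}_1(\tilde{x}_i)$, the critical region of the described SPRT for fuzzy hypotheses testing (1) is defined as $C = \bigcup_{n=1}^{\infty} C_n$, where
$$C_n = \{(\tilde{x}_1, \dots, \tilde{x}_n) : k_0 < \tilde{R}_j < k_1,\ j = 1, \dots, n - 1,\ \tilde{R}_n \le k_0\}.$$
Similarly, the acceptance region can be defined as $A = \bigcup_{n=1}^{\infty} A_n$, where
$$A_n = \{(\tilde{x}_1, \dots, \tilde{x}_n) : k_0 < \tilde{R}_j < k_1,\ j = 1, \dots, n - 1,\ \tilde{R}_n \ge k_1\}.$$
In view of the definitions of the WPDF, $\alpha$, $\beta$ and other related concepts, all results of the ordinary SPRT hold in this case, with the following modifications. In this case, we have
$$Z_i = \log\frac{\tilde{f}_0(\tilde{X}_i)}{\tilde{f}_1(\tilde{X}_i)}.$$
Hence,
$$E[Z_i \mid H_j] = \sum_{\tilde{x} \in \tilde{\mathcal{X}}} \log\frac{\tilde{f}_0(\tilde{x})}{\tilde{f}_1(\tilde{x})}\, \tilde{f}_j(\tilde{x}), \quad j = 0, 1.$$
Note that $Z_i$ is an ordinary RV.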
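A minimal sketch of the SPRT for fuzzy data follows, assuming the test is carried out on the log-ratio of the two WPDFs evaluated at each fuzzy observation, with thresholds $\log(\alpha/(1-\beta))$ and $\log((1-\alpha)/\beta)$. The WPDF values in `WPDF0` and `WPDF1` are hypothetical numbers chosen for illustration and are not taken from the examples of the paper; the function names are likewise assumptions.

```python
import math
from typing import Dict, Iterable, Tuple

# Hypothetical WPDFs of two fuzzy data under H0 and H1 (each sums to 1).
WPDF0: Dict[str, float] = {"x_I": 0.6, "x_II": 0.4}
WPDF1: Dict[str, float] = {"x_I": 0.3, "x_II": 0.7}


def fuzzy_sprt(fuzzy_data: Iterable[str],
               wpdf0: Dict[str, float],
               wpdf1: Dict[str, float],
               alpha: float,
               beta: float) -> Tuple[str, int]:
    """Sequential test on fuzzy observations using the log-ratio of WPDFs."""
    log_k0 = math.log(alpha / (1.0 - beta))
    log_k1 = math.log((1.0 - alpha) / beta)
    total = 0.0
    n = 0
    for n, label in enumerate(fuzzy_data, start=1):
        total += math.log(wpdf0[label] / wpdf1[label])   # Z_n
        if total <= log_k0:
            return "reject H0", n
        if total >= log_k1:
            return "accept H0", n
    return "continue sampling", n


def expected_z(wpdf0: Dict[str, float],
               wpdf1: Dict[str, float],
               wpdf_true: Dict[str, float]) -> float:
    """E[Z] = sum over fuzzy data of log(f0/f1) weighted by the true WPDF."""
    return sum(math.log(wpdf0[k] / wpdf1[k]) * wpdf_true[k] for k in wpdf0)


print(fuzzy_sprt(["x_II"] * 10, WPDF0, WPDF1, alpha=0.05, beta=0.10))
print(expected_z(WPDF0, WPDF1, WPDF0), expected_z(WPDF0, WPDF1, WPDF1))
```

Because each $Z_i$ is an ordinary random variable taking finitely many values, its expectation under either hypothesis is a finite sum, which is what `expected_z` computes.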
Example 3 Let $X_1, X_2, \dots$ be a sequence of iid RV's from Bernoulli(θ), $0 < \theta < 1$. We want to test where according to two fuzzy data (fuzzy subsets of $\mathcal{X} = \{0, 1\}$) $\tilde{x}_I$ and $\tilde{x}_{II}$ whose membership functions are defined by The normalized membership function of $H_j(\theta)$ is If we denote this FRV, its fuzzy observation and its PDF by $\tilde{X}$, $\tilde{x}$, and $\tilde{f}(\tilde{x}; \theta)$, respectively, then using Example 1, we have It is easy to show that Hence, for $i = 1, 2, \dots$, we obtain
Example 4 Let $X_1, X_2, \dots$ be a sequence of iid RV's from a $N(\mu, \sigma^2)$ population, i.e. We want to test with membership functions using the SPRT, according to three fuzzy data (fuzzy subsets of $\mathcal{X} = (-\infty, +\infty)$) $\tilde{x}_I$, $\tilde{x}_{II}$, and $\tilde{x}_{III}$, whose membership functions are defined by The fuzzy subsets $\tilde{x}_I$, $\tilde{x}_{II}$, and $\tilde{x}_{III}$ can be interpreted as the values of "very small", "near to zero", and "very large".
Note that the µ's are measurable and satisfy the orthogonality constraint (see Definitions 1 and 2).
The normalized membership function of $H_j(\theta)$ is Denote this FRV, its fuzzy observation and its PDF by $\tilde{X}$, $\tilde{x}$, and $\tilde{f}(\tilde{x}; \theta)$, respectively.
Let We want to test where the membership functions $H_0(\theta)$ and $H_1(\theta)$ are defined by using the SPRT, according to three fuzzy data, fuzzy subsets of $\mathcal{X} = (0, +\infty)$, $\tilde{x}_I$, $\tilde{x}_{II}$, and $\tilde{x}_{III}$, whose membership functions are defined by We can interpret the fuzzy subsets $\tilde{x}_I$, $\tilde{x}_{II}$, and $\tilde{x}_{III}$ as the values of "near to zero", "near to 3/2", and "very large". Note that the µ's are measurable and satisfy the orthogonality constraint.
The normalized membership function of $H_j(\theta)$ is Denote this FRV, its fuzzy observation and its PDF by $\tilde{X}$, $\tilde{x}$, and $\tilde{f}(\tilde{x}; \theta)$, respectively. It can be shown that Thus, for $i = 1, 2, \dots$, we obtain .95, and we therefore take n = 9, whereas $E[N \mid H_1 \text{ true}] = 5.95$ and we take n = 6.
We can interpret the fuzzy subsets $\tilde{x}_I$, $\tilde{x}_{II}$, and $\tilde{x}_{III}$ as the values of "near to zero", "near to 0.5", and "near to 1". It is clear that all µ's are measurable and satisfy the orthogonality constraint of Definition 1.
If we denote this FRV, its fuzzy observation and its PDF by $\tilde{X}$, $\tilde{x}$, and $\tilde{f}(\tilde{x}; \theta)$, respectively, then using Definition 2, we have $\tilde{f}(\tilde{x}; \theta) =$
Sampling stops when $\sum_{i=1}^{m} z_i \le \log(k_0)$ (and reject $H_0$) or $\sum_{i=1}^{m} z_i \ge \log(k_1)$ (and accept $H_0$). Let $N$ be the RV denoting the sample size of the SPRT. The SPRT with error sizes $\alpha$ and $\beta$ minimizes both $E[N \mid H_0 \text{ true}]$ and $E[N \mid H_1 \text{ true}]$ among all tests (sequential or not) which satisfy $P(H_0 \text{ rejected} \mid H_0 \text{ true}) \le \alpha$ and $P(H_0 \text{ accepted} \mid H_0 \text{ false}) \le \beta$.