Bootstrap Statistical Inference for the Variance Based on Fuzzy Data

The bootstrap is a simple and straightforward method for calculating approximate biases, standard deviations, and confidence intervals, for testing statistical hypotheses, and so forth, in almost any nonparametric estimation problem. In this paper we describe a bootstrap method for the variance that is designed directly for hypothesis testing in the case of fuzzy data, based on the Yao-Wu signed distance.


Introduction
Statistical analysis in its traditional form is based on the crispness of data, random variables, point estimates, hypotheses, and so on. There are many situations in which such concepts are imprecise. On the other hand, the theory of fuzzy sets is a well-known tool for formulating and analyzing imprecise and subjective concepts. Therefore, confidence intervals and hypothesis tests with fuzzy data can be important. Methods for statistical inference (confidence intervals and hypothesis tests) in fuzzy environments have been developed in different approaches. Filzmoser and Viertl (2004) present a test based on fuzzy values by introducing the fuzzy p-value. Torabi, Behboodian, and Taheri (2006) develop a new approach for testing fuzzy hypotheses when the available data are fuzzy, too. They state and prove a generalized Neyman-Pearson Lemma for such problems. Some methods of statistical inference with fuzzy data are reviewed by Viertl (2006). Buckley (2005, 2006) studies the problems of statistical inference in the fuzzy environment. Thompson and Geyer (2007) propose the fuzzy p-value in latent variable problems. Taheri and Arefi (2008) exhibit an approach to test fuzzy hypotheses based on fuzzy test statistics.
The bootstrap using fuzzy data has been developed in different approaches. Montenegro, Colubi, Casals, and Gil (2004) present asymptotic one-sample procedures. The asymptotic development of Körner (2000) concerns general fuzzy random variables (taking values in the space of compact convex fuzzy sets of a finite-dimensional Euclidean space). In Gonzalez-Rodriguez, Montenegro, Colubi, and Gil (2006) it is shown that the one-sample method of testing the mean of a fuzzy random variable can be extended to general ones (more precisely, to those whose range is not necessarily finite and whose values are fuzzy subsets of a finite-dimensional Euclidean space).
In this paper we construct a new method for bootstrap hypothesis testing in a fuzzy environment which is completely different from those mentioned before. The paper is organized as follows: In Section 2 we describe some basic concepts of canonical fuzzy numbers and the Yao and Wu (2000) signed distance. In Section 3 we derive crisp and fuzzy bootstrap confidence intervals for the variance. In Section 4 we summarize the testing of crisp and fuzzy hypotheses.

Preliminaries
In this section we study canonical fuzzy numbers and the Yao-Wu signed distance.

Canonical Fuzzy Numbers
Let $X$ be the universal space. A fuzzy subset $\tilde{x}$ of $X$ is defined by its membership function $\mu_{\tilde{x}} : X \to [0, 1]$. We denote by $\tilde{x}_\alpha = \{x : \mu_{\tilde{x}}(x) \ge \alpha\}$ the α-cut set of $\tilde{x}$, and $\tilde{x}_0$ is the closure of the set $\{x : \mu_{\tilde{x}}(x) > 0\}$. Furthermore,
(1) $\tilde{x}$ is called a normal fuzzy set, if there exists an $x \in X$ such that $\mu_{\tilde{x}}(x) = 1$,
(2) $\tilde{x}$ is called a convex fuzzy set, if $\mu_{\tilde{x}}(\lambda x + (1 - \lambda) y) \ge \min(\mu_{\tilde{x}}(x), \mu_{\tilde{x}}(y))$ for all $\lambda \in [0, 1]$,
(3) the fuzzy set $\tilde{x}$ is called a fuzzy number, if $\tilde{x}$ is a normal convex fuzzy set and its α-cut sets are bounded for all $\alpha \neq 0$,
(4) $\tilde{x}$ is called a closed fuzzy number, if $\tilde{x}$ is a fuzzy number and its membership function $\mu_{\tilde{x}}$ is upper semicontinuous,
(5) $\tilde{x}$ is called a bounded fuzzy number, if $\tilde{x}$ is a fuzzy number and its membership function $\mu_{\tilde{x}}$ has compact support.
If $\tilde{x}$ is a closed and bounded fuzzy number with $x^L_\alpha = \inf\{x : x \in \tilde{x}_\alpha\}$ and $x^U_\alpha = \sup\{x : x \in \tilde{x}_\alpha\}$, and its membership function is strictly increasing on the interval $[x^L_\alpha, x^L_1]$ and strictly decreasing on the interval $[x^U_1, x^U_\alpha]$ for any $\alpha \in [0, 1]$, then $\tilde{x}$ is called a canonical fuzzy number.
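Let $\odot$ be a binary operation $\oplus$ or $\ominus$ between two canonical fuzzy numbers $\tilde{a}$ and $\tilde{b}$. The membership function of $\tilde{a} \odot \tilde{b}$ is defined by $\mu_{\tilde{a} \odot \tilde{b}}(z) = \sup_{x \circ y = z} \min\{\mu_{\tilde{a}}(x), \mu_{\tilde{b}}(y)\}$ for $\odot \in \{\oplus, \ominus\}$ and $\circ \in \{+, -\}$.
In the following, let $\odot_{\mathrm{int}}$ denote a binary operation $\oplus_{\mathrm{int}}$ or $\ominus_{\mathrm{int}}$ between two closed intervals.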
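As a small illustration of the definitions above, the following Python sketch represents a triangular fuzzy number (a common example of a canonical fuzzy number) through its membership function and α-cuts, and combines two α-cuts by interval arithmetic. The class name `TriangularFuzzyNumber` and the interval rules $[a, b] + [c, d] = [a + c, b + d]$ and $[a, b] - [c, d] = [a - d, b - c]$ are assumptions of this sketch (the usual interval-arithmetic conventions); they are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Hypothetical helper: triangular fuzzy number with support
    [left, right] and membership 1 at `mode` (left < mode < right)."""
    left: float
    mode: float
    right: float

    def membership(self, x: float) -> float:
        """Piecewise-linear membership function."""
        if x == self.mode:
            return 1.0
        if self.left < x < self.mode:
            return (x - self.left) / (self.mode - self.left)
        if self.mode < x < self.right:
            return (self.right - x) / (self.right - self.mode)
        return 0.0

    def alpha_cut(self, alpha: float) -> tuple[float, float]:
        """Closed interval [x^L_alpha, x^U_alpha] for alpha in [0, 1]."""
        lower = self.left + alpha * (self.mode - self.left)
        upper = self.right - alpha * (self.right - self.mode)
        return lower, upper


def add_intervals(p: tuple[float, float], q: tuple[float, float]) -> tuple[float, float]:
    """Assumed rule: [a, b] + [c, d] = [a + c, b + d]."""
    return (p[0] + q[0], p[1] + q[1])


def subtract_intervals(p: tuple[float, float], q: tuple[float, float]) -> tuple[float, float]:
    """Assumed rule: [a, b] - [c, d] = [a - d, b - c]."""
    return (p[0] - q[1], p[1] - q[0])
```

For instance, with `a = TriangularFuzzyNumber(1.0, 2.0, 4.0)` and `b = TriangularFuzzyNumber(0.0, 1.0, 2.0)`, the 0.5-cuts are (1.5, 3.0) and (0.5, 1.5), so their interval sum is (2.0, 4.5) and their interval difference is (0.0, 2.5).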

Yao-Wu Signed Distance
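Now we define a signed distance between fuzzy numbers, which is used later. Several ranking methods have been proposed so far by Cheng (1998), Modarres and Sadi-Nezhad (2001), and Nojavan and Ghazanfari (2006). In this paper we use another ranking system for canonical fuzzy numbers, which is realistic and is defined by Yao and Wu (2000) as follows. For real numbers $a, b \in \mathbb{R}$, the signed distance $a - b$ is positive, negative, or zero according to whether $a > b$, $a < b$, or $a = b$. Thus, we have a way to define the rank of any two numbers on $\mathbb{R}$.
Definition 2: For each $\tilde{a}, \tilde{b}$ (arbitrary canonical fuzzy numbers), define the signed distance of $\tilde{a}$ and $\tilde{b}$ as $d(\tilde{a}, \tilde{b}) = \frac{1}{2} \int_0^1 \left( a^L_\alpha + a^U_\alpha - b^L_\alpha - b^U_\alpha \right) d\alpha$. Here $d(\tilde{a}, \tilde{b})$ means the distance of $\tilde{a}$ to $\tilde{b}$.
Definition 3: (Yao and Wu, 2000) For each $\tilde{a}, \tilde{b}$ (arbitrary canonical fuzzy numbers), define the rankings $\prec$, $\succ$, and $\approx$ of $\tilde{a}$ and $\tilde{b}$ by $\tilde{a} \prec \tilde{b}$ if and only if $d(\tilde{a}, \tilde{b}) < 0$, $\tilde{a} \succ \tilde{b}$ if and only if $d(\tilde{a}, \tilde{b}) > 0$, and $\tilde{a} \approx \tilde{b}$ if and only if $d(\tilde{a}, \tilde{b}) = 0$.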
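To make the signed distance concrete, the sketch below evaluates it numerically from α-cuts and uses its sign for ranking. It assumes the form commonly attributed to Yao and Wu (2000), $d(\tilde{a}, \tilde{b}) = \frac{1}{2} \int_0^1 (a^L_\alpha + a^U_\alpha - b^L_\alpha - b^U_\alpha)\, d\alpha$. The helper names (`triangular_alpha_cut`, `signed_distance`, `rank`) and the use of a midpoint rule for the integral are choices of this sketch only.

```python
from typing import Callable, Tuple

# An alpha-cut function maps alpha in [0, 1] to the interval (lower, upper).
AlphaCut = Callable[[float], Tuple[float, float]]


def triangular_alpha_cut(left: float, mode: float, right: float) -> AlphaCut:
    """Alpha-cuts of a triangular fuzzy number (hypothetical helper)."""
    def cut(alpha: float) -> Tuple[float, float]:
        return (left + alpha * (mode - left), right - alpha * (right - mode))
    return cut


def signed_distance(a: AlphaCut, b: AlphaCut, steps: int = 1000) -> float:
    """Midpoint-rule approximation of
    0.5 * integral_0^1 (a_L + a_U - b_L - b_U) d(alpha)  (assumed formula)."""
    total = 0.0
    for i in range(steps):
        alpha = (i + 0.5) / steps
        a_lo, a_hi = a(alpha)
        b_lo, b_hi = b(alpha)
        total += a_lo + a_hi - b_lo - b_hi
    return 0.5 * total / steps


def rank(a: AlphaCut, b: AlphaCut, tol: float = 1e-9) -> str:
    """Rank two fuzzy numbers by the sign of their signed distance."""
    d = signed_distance(a, b)
    if d > tol:
        return "succ"    # a is ranked above b
    if d < -tol:
        return "prec"    # a is ranked below b
    return "approx"      # a and b are ranked as equal
```

For a triangular fuzzy number with parameters (l, m, r), the integral gives a signed distance to the crisp number 0 of (l + 2m + r)/4, so the numbers (1, 2, 4) and (0, 1, 2) have signed distance 2.25 - 1 = 1.25 and the first is ranked above the second.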

Bootstrap Confidence Interval for Variances
In this section we introduce a way to obtain crisp and fuzzy bootstrap confidence intervals based on fuzzy data. By applying the bootstrap to fuzzy observations, we obtain accurate intervals without relying on normal theory. This procedure estimates the $\chi^2$-distribution directly from the fuzzy data. Here is the bootstrap method in more detail.

Crisp Confidence Interval
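Suppose that we have a canonical fuzzy random sample $\tilde{\mathbf{x}} = (\tilde{x}_1, \ldots, \tilde{x}_n)$. We generate $B$ bootstrap fuzzy random samples $\tilde{\mathbf{x}}^*_1, \ldots, \tilde{\mathbf{x}}^*_B$ (i.e., each $\tilde{\mathbf{x}}^*_b$ is a fuzzy sample of size $n$ drawn randomly with replacement from $\tilde{\mathbf{x}}$), and for each we compute the statistic $\chi^{2*}_b$, $b = 1, \ldots, B$, which is based on the Yao-Wu signed distance $d$. The $\gamma$th percentile of $\chi^{2*}_b$ is estimated by the $B\gamma$th value of the $B$ replications arranged in increasing order, and these percentiles finally determine the crisp bootstrap confidence interval using fuzzy data. If $B\gamma$ is not an integer, the following procedure can be used. Assuming $\gamma \le 1/2$, let $k = [(B + 1)\gamma]$ be the largest integer less than or equal to $(B + 1)\gamma$. Then we define the empirical $\gamma$ and $1 - \gamma$ quantiles by the $k$th and $(B + 1 - k)$th values of the ordered $\chi^{2*}_b$, respectively.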
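The resampling and percentile steps described above can be sketched in Python as follows. Several parts are assumptions of this sketch rather than the paper's definitions: the fuzzy observations are taken to be triangular numbers (l, m, r), each is reduced to its signed distance to zero, (l + 2m + r)/4, and the bootstrap statistic is taken to be $(n - 1) S^{*2}_b / S^2$ with sample variances computed from those signed distances. The data in the usage note are invented for illustration and are not those of Table 1.

```python
import math
import random
from typing import List, Sequence, Tuple

Triangular = Tuple[float, float, float]  # (left, mode, right)


def signed_distance_to_zero(x: Triangular) -> float:
    """Assumed defuzzification: signed distance of a triangular number to 0."""
    left, mode, right = x
    return (left + 2.0 * mode + right) / 4.0


def sample_variance(values: Sequence[float]) -> float:
    """Unbiased sample variance of crisp values."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def bootstrap_statistics(sample: Sequence[Triangular], B: int,
                         rng: random.Random) -> List[float]:
    """B replications of the assumed statistic (n - 1) * S*^2 / S^2."""
    n = len(sample)
    s2 = sample_variance([signed_distance_to_zero(x) for x in sample])
    stats = []
    for _ in range(B):
        # Draw a fuzzy sample of size n with replacement from the data.
        resample = [rng.choice(sample) for _ in range(n)]
        s2_star = sample_variance([signed_distance_to_zero(x) for x in resample])
        stats.append((n - 1) * s2_star / s2)
    return stats


def empirical_percentiles(stats: Sequence[float], gamma: float) -> Tuple[float, float]:
    """k-th and (B + 1 - k)-th ordered values with k = floor((B + 1) * gamma)."""
    B = len(stats)
    k = math.floor((B + 1) * gamma)
    if k < 1:
        raise ValueError("B is too small for this gamma")
    ordered = sorted(stats)
    return ordered[k - 1], ordered[B - k]
```

A possible call is `empirical_percentiles(bootstrap_statistics(data, 999, random.Random(0)), 0.05)`, where `data` is a list of (l, m, r) triples. Passing a seeded `random.Random` instance keeps the resampling reproducible.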
Example 1: Suppose that we have taken a fuzzy random sample of size $n = 12$ from a population and have observed the triangular fuzzy data of Table 1. If $B = 10000$, the estimates of the 5% and 95% percentiles are the 500th and 9500th values of the $\chi^{2*}_b$ arranged in increasing order. The last line of Table 2 shows the percentiles of $\chi^{2*}_b$ for the variance computed using 10000 bootstrap samples.
Figure 1 shows the distribution of $\chi^{2*}_b$ computed using 10000 bootstrap samples, from which the bootstrap confidence interval ($\gamma = 0.05$, i.e., 90%) using fuzzy data is obtained.

Fuzzy Confidence Interval
We generate $B$ bootstrap fuzzy random samples $\tilde{\mathbf{x}}^*_1, \ldots, \tilde{\mathbf{x}}^*_B$. The fuzzy bootstrap confidence interval $\Pi^{**}$ using fuzzy data is then specified by its α-cuts, which in turn determine its membership function $\mu_{\Pi^{**}}$.
Example 2: Consider the sample in Table 1. The α-cuts of the bootstrap confidence interval ($\gamma = 0.05$, i.e., 90%) using fuzzy data are given in Table 3 for some values of α.
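The link between α-cuts and the membership function can be illustrated numerically. For nested α-cuts, the standard representation is $\mu(x) = \sup\{\alpha : x \in \Pi_\alpha\}$, and the sketch below approximates this supremum on a grid of α values. The function `example_alpha_cut` is a purely hypothetical family of nested intervals with made-up endpoints; it does not reproduce the α-cuts of the paper.

```python
from typing import Callable, Tuple

AlphaCut = Callable[[float], Tuple[float, float]]


def membership_from_alpha_cuts(x: float, alpha_cut: AlphaCut,
                               levels: int = 1000) -> float:
    """Approximate mu(x) = sup{alpha : x lies in the alpha-cut}
    on the grid alpha = 1/levels, 2/levels, ..., 1."""
    best = 0.0
    for i in range(1, levels + 1):
        alpha = i / levels
        lower, upper = alpha_cut(alpha)
        if lower <= x <= upper:
            best = alpha  # cuts are nested, so later hits are larger alphas
    return best


def example_alpha_cut(alpha: float) -> Tuple[float, float]:
    """Hypothetical nested intervals shrinking from [300, 1100] at
    alpha = 0 to [600, 700] at alpha = 1 (illustrative numbers only)."""
    return (300.0 + 300.0 * alpha, 1100.0 - 400.0 * alpha)
```

With these made-up cuts, a point inside the core such as 650 gets membership 1, the points 450 and 900 get membership 0.5, and a point outside the support such as 200 gets membership 0.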

Crisp Method and Crisp Hypotheses
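Based on fuzzy observations $\tilde{\mathbf{x}} = (\tilde{x}_1, \ldots, \tilde{x}_n)$, we consider an approach to test the hypotheses $H_0: \sigma = \sigma_0$ versus $H_1: \sigma \neq \sigma_0$.
Decision rule: Since $\Pi^*$ is a crisp confidence interval, we accept $H_0$ if $\sigma_0^2 \in \Pi^*$ and reject it otherwise.
Example 3: Consider the sample in Table 1. Suppose we are interested in a bootstrap test for hypotheses of this form with $\sigma_0^2 = 729$. Since $729 \in [320.5, 1082.1]$, we accept $H_0$.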
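The check carried out in Example 3 amounts to an interval-membership test. The short sketch below assumes that the decision rule is "accept $H_0$ when the hypothesized variance lies in the crisp interval", which is how the example reads; the function name `accept_null` is hypothetical.

```python
from typing import Tuple


def accept_null(sigma0_squared: float, interval: Tuple[float, float]) -> bool:
    """Accept H0 when the hypothesized variance lies in the crisp
    bootstrap confidence interval (assumed reading of the decision rule)."""
    lower, upper = interval
    return lower <= sigma0_squared <= upper


# Values from Example 3: 729 is compared with [320.5, 1082.1].
print(accept_null(729.0, (320.5, 1082.1)))  # True
```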

Fuzzy Method and Crisp Hypotheses
For the hypotheses of Subsection 4.1, we now consider the testing problem based on the fuzzy method.
Example 4: Consider the sample in Table 1. Suppose we are interested in a bootstrap test for the hypotheses $H_0: \sigma = 17.9$ versus $H_1: \sigma \neq 17.9$.

Fuzzy Method and Fuzzy Hypotheses
We define fuzzy sets of real numbers that extend the simple, one-sided, and two-sided ordinary (crisp) hypotheses to fuzzy ones.
Testing statistical hypotheses is a main topic in statistical inference. Typically, a statistical hypothesis is an assertion about the probability distribution of random variables. Traditionally, statisticians assume that the hypotheses to be tested are well defined. Sometimes this limitation forces the statistician to set up decision procedures in an unrealistic manner, because in realistic problems we may come across non-precise (fuzzy) hypotheses. For example, suppose that θ is the proportion of a population with a disease. We take a random sample and study it in order to get some idea about θ. In crisp hypothesis testing one uses hypotheses of the form $H_0: \theta = 0.2$ versus $H_1: \theta \neq 0.2$, or $H_0: \theta \le 0.2$ versus $H_1: \theta > 0.2$, and so on. However, we sometimes would like to test more realistic hypotheses. In this example, more realistic statements about θ would be that it is small, very small, large, approximately 0.2, and so on. Therefore, a more realistic formulation of the hypotheses might be $H_0$: θ is small versus $H_1$: θ is not small. We call such expressions fuzzy hypotheses.
Definition 4: Let $\theta_0$ be a known real number. A hypothesis of the form
• "H: θ is approximately $\theta_0$" is called a fuzzy simple hypothesis.
• "H: θ is not approximately $\theta_0$" is called a fuzzy two-sided hypothesis.
• "H: θ is essentially smaller than $\theta_0$" is called a fuzzy left one-sided hypothesis.
• "H: θ is essentially larger than $\theta_0$" is called a fuzzy right one-sided hypothesis.
We denote each of the above hypotheses by a fuzzy set of real numbers, identified with its membership function.
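One way to picture the four kinds of fuzzy hypotheses is through simple piecewise-linear membership functions. The shapes below (a triangle for "approximately", its complement for "not approximately", and linear ramps for the one-sided cases) and the parameter `spread` are hypothetical modelling choices of this sketch; the paper's own membership functions are not reproduced here.

```python
from typing import Callable

Membership = Callable[[float], float]


def approximately(theta0: float, spread: float) -> Membership:
    """'theta is approximately theta0' as a triangle of half-width spread."""
    def mu(theta: float) -> float:
        return max(0.0, 1.0 - abs(theta - theta0) / spread)
    return mu


def not_approximately(theta0: float, spread: float) -> Membership:
    """'theta is not approximately theta0' as the complement 1 - mu."""
    base = approximately(theta0, spread)
    return lambda theta: 1.0 - base(theta)


def essentially_smaller(theta0: float, spread: float) -> Membership:
    """'theta is essentially smaller than theta0' as a decreasing ramp."""
    def mu(theta: float) -> float:
        if theta <= theta0 - spread:
            return 1.0
        if theta >= theta0:
            return 0.0
        return (theta0 - theta) / spread
    return mu


def essentially_larger(theta0: float, spread: float) -> Membership:
    """'theta is essentially larger than theta0' as an increasing ramp."""
    def mu(theta: float) -> float:
        if theta >= theta0 + spread:
            return 1.0
        if theta <= theta0:
            return 0.0
        return (theta - theta0) / spread
    return mu
```

For the disease-proportion example, `approximately(0.2, 0.1)` assigns membership 1 to θ = 0.2 and membership 0 to any θ at least 0.1 away from 0.2.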

Assumptions: Let
• $C_T$ be the total area under $H_0$,
• $C_A$ be the area of the intersection between $H_0$ and $\mu_{\Pi^{**}}$,
• $C_R$ be the area of the intersection between $H_0$ and $1 - \mu_{\Pi^{**}}$.
We know that $H_0$ and $\Pi^{**}$ are canonical fuzzy numbers, thus the areas $C_T$, $C_A$, and $C_R$ are finite.
Decision rule: We accept $H_0$ when $C_A / C_R > 1$, with degree of acceptance DoA $= C_A / C_T$. Restricting attention to larger membership levels, such as 0.7 or 0.8, for $\theta_0$ and $\Pi^{**}$ could yield more accurate values of $C_T$, $C_A$, and $C_R$. In summary, the above procedure is an applicable tool in fuzzy statistical inference. At the end of the paper, we exhibit a decision-making method which for α = 1 coincides with the classical procedures.
Here, $H_0$ suggests that σ is approximately 30, and $H_1$ suggests that σ is away from 30.
Hence, using Maple 7, we obtain $C_A / C_R = 1.97 > 1$. Thus, we accept $H_0$ with DoA $C_A / C_T = 0.865$. Figure 5 shows the membership function $\mu_{\Pi^{**}}$ and the fuzzy hypotheses $H_0$ versus $H_1$. Figure 6 shows essentially the same, but the plot is based on a larger membership level of 0.42. In that case we have $C_A / C_R = 13 > 1$. Thus, we accept $H_0$ with DoA $C_A / C_T = 0.914$.
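The areas $C_T$, $C_A$, and $C_R$ and the resulting decision can be approximated numerically, as the following sketch suggests. It assumes that the intersection of two fuzzy sets is the pointwise minimum of their membership functions, that $H_0$ is accepted when $C_A / C_R > 1$, and that DoA $= C_A / C_T$. The triangular shapes used in the usage note are invented and do not reproduce the membership functions behind Figures 5 and 6.

```python
from typing import Callable, Dict

Membership = Callable[[float], float]


def triangular(left: float, mode: float, right: float) -> Membership:
    """Hypothetical triangular membership function."""
    def mu(x: float) -> float:
        if left < x <= mode:
            return (x - left) / (mode - left)
        if mode < x < right:
            return (right - x) / (right - mode)
        return 0.0
    return mu


def area(f: Membership, lo: float, hi: float, steps: int = 20000) -> float:
    """Composite trapezoidal rule for the area under f on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def fuzzy_test(h0: Membership, mu_ci: Membership,
               lo: float, hi: float) -> Dict[str, float]:
    """Areas C_T, C_A, C_R and the assumed decision rule C_A / C_R > 1."""
    c_t = area(h0, lo, hi)
    # Intersection taken as the pointwise minimum (assumption of this sketch).
    c_a = area(lambda x: min(h0(x), mu_ci(x)), lo, hi)
    c_r = area(lambda x: min(h0(x), 1.0 - mu_ci(x)), lo, hi)
    accept = c_r == 0.0 or c_a / c_r > 1.0
    return {"C_T": c_t, "C_A": c_a, "C_R": c_r,
            "accept_H0": accept, "DoA": c_a / c_t}
```

For example, `fuzzy_test(triangular(20, 30, 40), triangular(15, 28, 45), 10, 50)` integrates over a range covering both supports. The integration limits must contain the support of $H_0$ so that no part of the area is cut off.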

Conclusions
The new approach to bootstrap statistical inference for the variance based on fuzzy data has the following features:
1. It is established upon the notion of crisp and fuzzy confidence intervals (note that, in classical hypothesis testing, there is a relationship between interval estimation and hypothesis testing).
2. By introducing the concepts of DoA and DoR, it enables us to test fuzzy hypotheses in a rather natural way.
If $\tilde{a}$ and $\tilde{b}$ are two closed fuzzy numbers, then $\tilde{a} \oplus \tilde{b}$ and $\tilde{a} \ominus \tilde{b}$ are also closed fuzzy numbers.

Table 1: Fuzzy random sample of size $n = 12$ from a population

Table 2: Percentiles of the $\chi^2_7$ and $\chi^2_{11}$ distributions and of the bootstrap distribution of $\chi^{2*}_b$