Adaptive Nonparametric Tests for the Generalized Behrens-Fisher Problem

Some adaptive test procedures are developed for the generalized Behrens-Fisher problem. The first, a deterministic approach, calculates a measure of symmetry from each sample and uses these measures to choose between the modified Wilcoxon-Mann-Whitney test (Fligner and Policello, 1981) and the modified Mood's median test (Fligner and Rust, 1982). The second, a probabilistic approach, also combines the modified Wilcoxon-Mann-Whitney test and the modified Mood's median test, according to the evidence of asymmetry provided by the p-value from the triples test for symmetry given in Randles, Fligner, Policello, and Wolfe (1980). This probabilistic approach is further modified by using a suitable function of the p-value from the triples test. A simulation study reveals that the modified procedure performs reasonably well in terms of power and attainment of the nominal size.


Introduction
Consider the problem of testing the null hypothesis that the medians of two populations having continuous cumulative distribution functions (cdf's) are equal against a one- or two-sided alternative. If it is assumed that the populations have the same shape with common scale, the usual nonparametric procedures, such as the Wilcoxon-Mann-Whitney test (Mann and Whitney, 1947; Wilcoxon, 1945) and Mood's median test (Mood, 1954), are exactly distribution-free under the null hypothesis of equal medians. The distribution-free property allows the formulation of an exact size $\alpha$ test. In some situations, however, the two populations may have different shapes even under the null hypothesis of equal medians. In such a case, it is natural to treat the problem as a generalization of the Behrens-Fisher problem. When the populations have different shapes or variances, the standard nonparametric tests do not preserve the level, whether the sample sizes are small or large, and hence nonparametric estimation of the unknown parameters involved in the null distribution of the test statistic is required to obtain a distribution-free test procedure.
The Wilcoxon-Mann-Whitney statistic, under the assumption of symmetry for the underlying distributions, is modified by Potthoff (1963) and Zaremba (1962) to obtain conservative tests, while asymptotically distribution-free tests are proposed by Zaremba (1962) and Fligner and Policello (1981) for testing the equality of two medians without making any assumptions on the shapes of the underlying populations. If the underlying distributions are not both symmetric, the various modifications of the Wilcoxon-Mann-Whitney test are no longer asymptotically distribution-free and therefore may not maintain the nominal size. For the case where symmetry is violated, conservative procedures are suggested by Hettmansperger (1973) and Hettmansperger and Malin (1975). An asymptotically distribution-free test based on Mood's median test is proposed by Hettmansperger and Malin (1975) by first constructing a conservative version of Mood's test and then using an estimate of the null variance of the test statistic to raise the significance level closer to the nominal level. A modification of the Kruskal-Wallis test (Kruskal, 1952; Kruskal and Wallis, 1952) is proposed by Fligner and Rust (1984), which is exactly distribution-free under the assumption of equally shaped populations and is asymptotically distribution-free when the populations are assumed to be symmetric with equal medians. Fligner and Rust (1982) put forward a modification of Mood's median test which is exactly distribution-free when the populations have the same shape and is asymptotically distribution-free when they do not.
A robust solution to the generalized Behrens-Fisher problem is proposed by Babu and Padmanabhan (2002), which consists of bootstrapping an appropriately centered version of the Mann-Whitney statistic. Brunner and Munzel (2000) proposed a rank test where the asymptotic variance is estimated consistently by using the ranks of the overall observations as well as the ranks within each sample. It is not assumed that the underlying cdf's are continuous, so that data with arbitrary ties can be handled. A permutation test based on the studentized rank statistic of Brunner and Munzel (2000) is proposed by Neubert and Brunner (2007). A likelihood ratio test for this problem is suggested by Troendle (2002), where the number of parameters in the score equations is effectively reduced to one by using a recursive formula for the remaining parameters. Bandyopadhyay and Das (2004, 2005), for 0-1 scores, proposed some partially sequential test procedures using placement statistics based on a progressively censored scheme.
The purpose of the present paper is to provide adaptive test procedures for the generalized Behrens-Fisher problem and to compare them with the existing non-adaptive competitors. The first adaptive procedure (AD1) has a deterministic approach based on the idea of Hogg, Fisher, and Randles (1975). A simple measure of skewness is calculated for each sample to assess whether both distributions are symmetric. If both underlying distributions are found to be symmetric, the modified Wilcoxon-Mann-Whitney test is used; otherwise the modified Mood's median test is used. The second adaptive procedure (AD2) has a probabilistic approach, which uses the p-value from the triples test for symmetry given in Randles et al. (1980). A modification of the proposed probabilistic approach is also suggested, using a suitable function of the p-value from the triples test. This modified probabilistic approach (AD3) has relatively high power in nearly all cases while maintaining the level of significance reasonably well.
The article is arranged as follows. Section 2 introduces our adaptive test procedures. Section 3 provides an example illustrating the proposed procedures. In Section 4 some numerical computations are presented to assess the performance of the proposed test procedures relative to the various competitors. Some relevant asymptotic properties are discussed in Section 5. Finally, Section 6 gives brief concluding remarks.

Statement of the Problem and the Proposed Adaptive Tests
Let $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ be independent random samples from populations with continuous cdf's $F(x)$ and $G(y)$, respectively. Let $\theta_X$ and $\theta_Y$ denote, respectively, the unique medians of the X and Y populations. The problem considered here is to test
$$H_0: \theta_X = \theta_Y$$
against a suitable composite alternative. For simplicity, we consider the one-sided alternative
$$H_1: \theta_X < \theta_Y.$$
Such a testing problem, which drops the assumption of equal shapes or scales for two or more unknown continuous cdf's, can be viewed as a generalization of the famous Behrens-Fisher problem. Note that the more restrictive assumption that $G(y) = F((y - \theta_Y)/\tau)$ has not been made. The assumption is unnecessary and would not simplify the procedures.
Before introducing the proposed adaptive tests we present a brief description of the modified Wilcoxon-Mann-Whitney statistic, suggested by Fligner and Policello (1981) for the generalized Behrens-Fisher problem. The modified procedure is still exactly distribution-free when the populations are identical and asymptotically distribution-free, under some mild conditions, when the populations have equal medians but different shapes. The modified Wilcoxon-Mann-Whitney statistic can be used to test the null hypothesis $H_0$ provided that the underlying populations are symmetric; here $F$ and $G$ are assumed to be symmetric. Let $X_{(1)} \le \cdots \le X_{(n_1)}$ and $Y_{(1)} \le \cdots \le Y_{(n_2)}$ denote the ordered X and Y observations, respectively. Let $Q_i$ denote the rank of $X_{(i)}$ in the combined sample and define the placement of $X_{(i)}$ among the $Y_j$'s as
$$P_i = Q_i - i, \quad i = 1, 2, \ldots, n_1.$$
If $F_{n_1}(x)$ and $G_{n_2}(y)$ are the empirical cdf's of the X and Y samples, respectively, we note that $P_i = n_2 G_{n_2}(X_{(i)})$. Similarly, we define the placement of $Y_{(j)}$ by $S_j = n_1 F_{n_1}(Y_{(j)})$ for $j = 1, 2, \ldots, n_2$. The Wilcoxon-Mann-Whitney statistic $U$ is defined as the number of pairs $(X_i, Y_j)$ with $X_i < Y_j$, $i = 1, 2, \ldots, n_1$, $j = 1, 2, \ldots, n_2$. A consistent estimate of $\mathrm{var}(U)$ can be obtained as
$$\widehat{\mathrm{var}}(U) = \sum_{i=1}^{n_1} (P_i - \bar{P})^2 + \sum_{j=1}^{n_2} (S_j - \bar{S})^2 + \bar{P}\,\bar{S},$$
where $\bar{P}$ and $\bar{S}$ are the averages of the $P_i$ and the $S_j$, respectively. Note that, since the placements are functions of the ranks, the standardized statistic
$$\hat{U} = \frac{U - n_1 n_2/2}{\sqrt{\widehat{\mathrm{var}}(U)}}$$
is itself a function of the ranks.

We next discuss in brief the modification of Mood's median test for the generalized Behrens-Fisher problem proposed by Fligner and Rust (1982). The modified Mood test is asymptotically distribution-free over the broad null hypothesis $H_0$ considered here without surrendering any power to the original test under the usual nonparametric assumption. Let $Z_{(1)}, Z_{(2)}, \ldots, Z_{(N)}$ denote the combined sample order statistics, where $N = n_1 + n_2$. Then $M$, the combined sample median, is defined to be $Z_{((N+1)/2)}$ for $N$ odd and $\frac{1}{2}[Z_{(N/2)} + Z_{(N/2+1)}]$ for $N$ even. Mood's median test statistic $T$ is then computed from the numbers of observations in each sample that exceed $M$. Under $H_0$, and under the limiting sample-size condition that defines $\lambda$, $\sqrt{n_2}(T - \frac{1}{2})$ is asymptotically normal with mean 0 and variance $\sigma^2$ as $n_1, n_2 \to \infty$; the expression for $\sigma^2$ involves $\lambda$ and is given in Fligner and Rust (1982). The modified Mood's median test, due to Fligner and Rust (1982), is then based on the variable $\hat{T} = \sqrt{n_2}(T - \frac{1}{2})/\hat{\sigma}$, where $\hat{\sigma}^2$ is a consistent estimate of $\sigma^2$ whose construction uses the greatest integer function $[\cdot]$.

We now proceed to introduce the proposed adaptive test procedures. In this paper we consider three such procedures: a deterministic approach, a probabilistic approach, and a modified probabilistic approach.
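As a rough illustration of the modified Wilcoxon-Mann-Whitney statistic described above, the following Python sketch computes the placements, the count $U$ and the standardized statistic. It is a minimal sketch that assumes continuous data without ties; the function name `fligner_policello` is hypothetical, and the variance estimate is the placement-based form usually attributed to Fligner and Policello (1981), which should be checked against that paper before serious use.

```python
import math

def fligner_policello(x, y):
    """Return (U, U_hat) for samples x and y (no ties assumed).

    U     : number of pairs (x_i, y_j) with x_i < y_j
    U_hat : (U - n1*n2/2) / sqrt(estimated variance of U)
    """
    n1, n2 = len(x), len(y)
    # Placements: P_i = #{y_j < x_i}, S_j = #{x_i < y_j}.
    P = [sum(1 for yj in y if yj < xi) for xi in x]
    S = [sum(1 for xi in x if xi < yj) for yj in y]
    U = sum(S)
    P_bar, S_bar = sum(P) / n1, sum(S) / n2
    V1 = sum((p - P_bar) ** 2 for p in P)
    V2 = sum((s - S_bar) ** 2 for s in S)
    var_hat = V1 + V2 + P_bar * S_bar  # assumed form of the variance estimate
    centered = U - n1 * n2 / 2
    if var_hat == 0:
        # Degenerate case (e.g. complete separation): no finite standardization.
        return U, math.copysign(math.inf, centered)
    return U, centered / math.sqrt(var_hat)
```

For the one-sided alternative $\theta_X < \theta_Y$ considered here, large positive values of the second return value point towards rejection; swapping the two arguments flips its sign.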

The Deterministic Approach
Here we propose an asymptotically distribution-free adaptive two-sample test for the generalized Behrens-Fisher problem having a deterministic approach. Hogg et al. (1975) developed their adaptive procedures only in the context of hypothesis testing in the usual nonparametric two-sample and one-sample problems, but the attractive properties of these procedures also extend to more general situations. Hogg's procedure, or some modification thereof, can be effectively employed in our present situation as well. Here a fairly easy classification scheme is used which merely attempts to assess whether both underlying distributions are symmetric. We use the data themselves to make this classification, and on the basis of that information select an appropriate test statistic for testing $H_0$. For this purpose we use the following measures of symmetry
$$Q(x) = \frac{\bar{U}_\gamma(x) - \bar{M}_\gamma(x)}{\bar{M}_\gamma(x) - \bar{L}_\gamma(x)}, \qquad Q(y) = \frac{\bar{U}_\gamma(y) - \bar{M}_\gamma(y)}{\bar{M}_\gamma(y) - \bar{L}_\gamma(y)},$$
where $\bar{U}_\gamma(x)$, $\bar{M}_\gamma(x)$, $\bar{L}_\gamma(x)$ denote, respectively, the averages of the $\gamma n_1$ largest, middle and smallest order statistics corresponding to the X sample, and $\bar{U}_\gamma(y)$, $\bar{M}_\gamma(y)$, $\bar{L}_\gamma(y)$ denote the corresponding averages for the Y sample (Randles and Wolfe, 1979, p. 389). This statistic not only has strong intuitive appeal as a measure of skewness, but there is also a good theoretical reason for considering it. Since $Q(x)$ and $Q(y)$ are each a ratio of two linear functions of the order statistics, their convergence properties are better than those of some other measures of symmetry. Under the null hypothesis of identical populations, the order statistics are the complete sufficient statistics for the common, but unknown, cdf. Hence, by Basu's theorem on ancillary statistics, they are independent of every statistic which is distribution-free. In the present situation, both $\hat{T}$ and $\hat{U}$ are only asymptotically distribution-free and hence not generally independent of the selector statistics. It can be shown, however, that the difference between the nominal level $\alpha$ and the actual level converges to zero as the sample size increases. Thus the adaptive procedure can be argued to be asymptotically distribution-free. Moreover, such measures are location-free.
The proposed deterministic approach (AD1) accordingly uses the following classification scheme. When the data indicate that both populations are symmetric, i.e. $Q(x), Q(y) \in J$, use the modified Wilcoxon-Mann-Whitney statistic $\hat{U}$, where $J$ is an interval suitably chosen so that the overall adaptive procedure achieves good power while maintaining the nominal level satisfactorily. When the data indicate that the populations are not both symmetric, i.e. at least one of $Q(x)$ and $Q(y)$ does not belong to the interval $J$, use the modified Mood's median test statistic $\hat{T}$. Hogg et al. (1975) considered the same measure of skewness as the selector statistic for their adaptive procedure, along with a measure of tailweight, by taking $\gamma = 0.05$. It should be noted that the value of the measure is 1 when the underlying distribution is symmetric. According to Hogg, if the value of this statistic is greater than 2 then the right tail of the distribution seems to be longer than the left; i.e., there is an indication that the distribution is skewed to the right. If the value is less than 0.5, the sample indicates that the distribution may be skewed to the left.
In our simulation study we have considered small and moderate sample sizes. For $n_2 = 15$, taking $\gamma = 0.05$ gives $\gamma n_2 = 0.75$, so that $\bar{U}_{0.05}$ and $\bar{L}_{0.05}$ would each be based on less than one observation and are not meaningful. So in our case we take $\gamma = 0.10$. Different choices of the interval $J = (c, d)$ are examined with $c$ and $d$ around 0.5 and 2, respectively. The interval should be chosen so that the power of the proposed adaptive procedure is close to that of the best test for the distribution considered while the nominal significance level is maintained. From the simulation studies $J = (0.5, 2.3)$ is found to be the best choice in terms of the robustness and the power of the adaptive procedure. We can say that the assumption of symmetry is tenable if $Q(x)$ and $Q(y)$ lie between 0.5 and 2.3.
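The AD1 selection rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not spell out how a non-integer $\gamma n$ is handled or how wide the "middle" block is, so the sketch weights boundary observations fractionally and takes the middle 50% of the sample, in the style of Hogg et al. (1975). The names `block_mean`, `skewness_q` and `ad1_select` are hypothetical, and the values reported in the example section need not be reproduced by this particular convention.

```python
def block_mean(sorted_x, start, length):
    """Average of sorted_x over the index interval [start, start + length).

    Observation i is taken to occupy [i, i + 1); an observation that only
    partly overlaps the interval is weighted by the overlap.
    """
    total = 0.0
    for i, value in enumerate(sorted_x):
        overlap = min(i + 1, start + length) - max(i, start)
        if overlap > 0:
            total += overlap * value
    return total / length

def skewness_q(sample, gamma=0.10, middle=0.50):
    """Q = (U_bar - M_bar) / (M_bar - L_bar); equals 1 for a symmetric sample."""
    xs = sorted(sample)
    n = len(xs)
    k = gamma * n
    lower = block_mean(xs, 0.0, k)                         # smallest gamma*n
    upper = block_mean(xs, n - k, k)                       # largest gamma*n
    mid = block_mean(xs, n * (1 - middle) / 2, n * middle) # assumed middle block
    return (upper - mid) / (mid - lower)

def ad1_select(x, y, interval=(0.5, 2.3), gamma=0.10):
    """Return 'U_hat' if both Q values lie in the interval J, else 'T_hat'."""
    c, d = interval
    symmetric = all(c < skewness_q(s, gamma) < d for s in (x, y))
    return "U_hat" if symmetric else "T_hat"
```

With this convention an evenly spaced sample gives $Q = 1$, while a sample with one very large observation gives a value well above 2.3 and therefore selects the modified Mood statistic.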

The Probabilistic Approach
In the present section we formulate an adaptive nonparametric procedure for the generalized Behrens-Fisher problem having a probabilistic or stochastic approach. Instead of prescribing a fixed level of significance for the pretest, we use a p-value based randomized classification rule for the selection of an appropriate test statistic. A disadvantage of the deterministic adaptive procedure, based on Hogg's principle, is the discontinuous nature of the test selection method. The test selection is likely to be affected if the value of the selector statistic is near the boundary between two partitioning sets, since a small change in the data may then move the observed value of the selector statistic over the boundary. Moreover, the boundaries are defined empirically. Although such partitioning may give good results, more objective and stochastic considerations may lead to different, even better, adaptive procedures. The proposed probabilistic procedure consists in calculating some classification probabilities, based on the p-values of pretests, to decide on the appropriate test for the generalized Behrens-Fisher problem. These so-called probabilistic or stochastic adaptive procedures are shown to be effective and yet computationally simple enough to appeal to the practicing statistician.
The proposed probabilistic approach is also a combination of the modified Wilcoxon-Mann-Whitney statistic and the modified Mood's median test statistic according to the evidence provided by the p-values of pretests for symmetry. For testing symmetry we make use of the well-known triples test presented by Randles et al. (1980). Before introducing the proposed probabilistic approach we present a brief review of the triples test. The null hypothesis for the triples test is that the underlying population is symmetric about $\theta$, against the alternative that it is asymmetric. For a sample $X_1, \ldots, X_n$ let
$$f^*(X_i, X_j, X_k) = \frac{1}{3}\left[\mathrm{sgn}(X_i + X_j - 2X_k) + \mathrm{sgn}(X_i + X_k - 2X_j) + \mathrm{sgn}(X_j + X_k - 2X_i)\right],$$
where $\mathrm{sgn}(x) = 1, 0, -1$ as $x >, =, < 0$. The triples test is then based on the U-statistic
$$\hat{\eta} = \binom{n}{3}^{-1} \sum_{i<j<k} f^*(X_i, X_j, X_k).$$
Reject the null hypothesis of symmetry if $|V| > \tau_{\alpha/2}$, where $V = \hat{\eta}/\sqrt{\hat{v}}$, $\tau_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution, and $\hat{v}$ is a consistent estimate of the variance of the U-statistic $\hat{\eta}$.
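To make the kernel of the triples test concrete, the following Python sketch computes the U-statistic by brute force over all triples. It is a minimal sketch: the function names are hypothetical, the cubic running time is only acceptable for small samples, and the variance estimate needed to standardize the statistic (given in Randles et al., 1980) is deliberately not reproduced here.

```python
from itertools import combinations

def sign(v):
    """Return 1, 0 or -1 according as v is positive, zero or negative."""
    return (v > 0) - (v < 0)

def triples_eta(sample):
    """Average of the triples kernel f* over all unordered triples (n >= 3)."""
    total = 0
    count = 0
    for a, b, c in combinations(sample, 3):
        # 3 * f*(a, b, c): each term compares one point with the other two.
        total += sign(a + b - 2 * c) + sign(a + c - 2 * b) + sign(b + c - 2 * a)
        count += 1
    return total / (3 * count)
```

A sample that is symmetric about its centre gives the value 0, whereas a long right tail pushes the value towards its upper bound of 1/3 and a long left tail towards -1/3.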
We now introduce the proposed adaptive rule (AD2) having a probabilistic approach. Let $p_1$ and $p_2$ denote, respectively, the p-values corresponding to the observed values $\hat{\eta}_1$ and $\hat{\eta}_2$ of the triples statistic for the X and Y samples. Each of these p-values measures the evidence concerning symmetry of the corresponding underlying distribution, a small value indicating asymmetry. Thus we may consider $p = \min(p_1, p_2)$ as a joint measure for both underlying distributions. Whenever $p_1$ and $p_2$ are observed, perform a Bernoullian trial with probability of success $p = \min(p_1, p_2)$. If success occurs, use the $\hat{U}$ test; otherwise, use the $\hat{T}$ test. In other words our adaptive test rule is: reject $H_0$ with probability $p$ if $\hat{U} > \hat{u}_{\alpha,n_1,n_2}$ and with probability $(1 - p)$ if $\hat{T} > \hat{t}_{\alpha,n_1,n_2}$; or equivalently: accept $H_0$ with probability $p$ if $\hat{U} \le \hat{u}_{\alpha,n_1,n_2}$ and with probability $(1 - p)$ if $\hat{T} \le \hat{t}_{\alpha,n_1,n_2}$, where $\hat{u}_{\alpha,n_1,n_2}$ and $\hat{t}_{\alpha,n_1,n_2}$ are the upper $\alpha$-critical values for the $\hat{U}$ and the $\hat{T}$ tests, respectively.
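The randomized selection step of AD2 can be sketched in Python as below. The function name `ad2_decision` and its arguments are hypothetical; the two component decisions (whether each of the two tests rejects at level $\alpha$) are assumed to have been computed beforehand and are simply passed in as booleans.

```python
import random

def ad2_decision(p1, p2, reject_u, reject_t, rng=random):
    """Randomized choice between the two component tests.

    p1, p2   : triples-test p-values for the X and Y samples
    reject_u : True if the modified Wilcoxon-Mann-Whitney test rejects H0
    reject_t : True if the modified Mood's median test rejects H0
    rng      : object with a random() method returning a value in [0, 1)

    Returns (name of the chosen test, whether H0 is rejected).
    """
    p = min(p1, p2)
    use_u = rng.random() < p  # "success" of the Bernoulli trial, probability p
    return ("U_hat", reject_u) if use_u else ("T_hat", reject_t)
```

Passing a seeded `random.Random` instance as `rng` makes the randomized decision reproducible, which may be useful when the rule is reported in an applied analysis.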

The Modified Probabilistic Approach
The adaptive procedure considered here is a modification of the probabilistic approach. A p-value higher than 0.05 is considered to be statistically insignificant at the 5% level of significance. Thus, for example, $p = 0.10$ indicates that both $p_1$ and $p_2$ are greater than or equal to 0.10, i.e., both distributions may be considered to be symmetric, and hence it is desirable to use the modified Wilcoxon-Mann-Whitney test with a high probability. However, the proposed probabilistic approach uses the $\hat{U}$ test with probability 0.10 and the $\hat{T}$ test with probability 0.90. Here $p = \alpha = 0.05$ can be treated as the point of complete dilemma. To overcome this drawback we consider a real-valued function $k(\cdot)$ satisfying the following conditions: i) $k$ is monotone non-decreasing, ii) $k(0) = 0$, iii) $k(0.05) = 0.5$ and iv) $k(1) = 1$. Then we propose our modified adaptive rule (AD3) as follows. Perform the random experiment with probability of success $p^* = \min(p_1^*, p_2^*)$, where $p_1^* = k(p_1)$ and $p_2^* = k(p_2)$. The test rule remains the same as in the probabilistic approach with $p$ replaced by $p^*$.
Various choices of $k$ satisfying the above conditions may be obtained. For example, one may consider the cdf of a beta distribution with median at $\alpha$, the desired level of significance. But we must be aware of the dangers associated with the use of such adaptive procedures. Even though the final test is performed at the desired level of significance $\alpha$, the actual level of the overall testing procedure may be quite different from the nominal level of significance. So the robustness of the adaptive procedures should be carefully examined. For example, consider the triples test, which is a highly recommended test of whether a continuous univariate population is symmetric. We have used it in the first stage of our probabilistic approach and the overall test has been found to be robust enough. But when we use the same statistic as the selector statistic in the deterministic approach, the robustness property is not maintained. So the choice of the selector statistic for such an adaptive procedure should be made with extra care. Here, all the proposed adaptive procedures are only asymptotically distribution-free and so the actual level of the adaptive procedures may be slightly higher than the nominal level. Thus, for the modified probabilistic approach, one should be concerned with an appropriate choice of $k$ so that the adaptive procedure is robust and performs reasonably well in terms of power. The beta distribution with indices (1, 13.513406) fits our present situation.
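For the beta distribution with first index 1, the cdf has the closed form $k(p) = 1 - (1 - p)^b$, and the condition $k(\alpha) = 0.5$ gives $b = \log(0.5)/\log(1 - \alpha)$, which for $\alpha = 0.05$ agrees with the second index quoted above to about six significant figures. The Python sketch below illustrates this; the names `k`, `B` and `ad3_probability` are hypothetical and the closed form is a standard property of the beta(1, b) distribution rather than something stated in the text.

```python
import math

ALPHA = 0.05
# Second beta index chosen so that the median of beta(1, B) equals ALPHA.
B = math.log(0.5) / math.log(1.0 - ALPHA)

def k(p, b=B):
    """Cdf of the beta(1, b) distribution at p, for 0 <= p <= 1."""
    return 1.0 - (1.0 - p) ** b

def ad3_probability(p1, p2, b=B):
    """Success probability p* = min(k(p1), k(p2)) of the AD3 Bernoulli trial."""
    return min(k(p1, b), k(p2, b))
```

Because $k$ rises steeply near zero, triples-test p-values that are comfortably above 0.05 are mapped to selection probabilities close to 1, which is the behaviour the modified rule is designed to have.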

An Example
As an example to illustrate the proposed methods we consider the data given in McNabb (2004, p. 242). A political scientist was interested in knowing the percentage of voters who supported a bill to limit increases in property taxes to no more than 5% per year. Data were collected in 28 countries out of which 15 countries were considered to be predominantly urban while 13 countries were predominantly rural. The null hypothesis for this study was that rural voters are no more likely to support the tax-limit bill than urban voters, whereas the alternative hypothesis was that rural voters are more likely to support limits on property taxes than are urban voters. The results of the telephone survey are given in Table 1.
Table 1: Percentage of voters supporting a tax-limit bill.
Urban countries (X sample, 15 values): 22.20, 19.90, 42.09, 26.12, 41.11, 46.44, 63.67, 44.12, 44.22, 44.23, 60.56, 33.12, 51.07, 43.07, 43.55
Rural countries (Y sample, 13 values): 33.30, 29.89, 59.76, 35.22, 51.98, 54.66, 69.09, 45.24, 47.93, 53.22, 61.12, 42.90, 58.43

For testing $H_0: \theta_X = \theta_Y$ against the alternative $H_1: \theta_X < \theta_Y$ using the $\hat{U}$ test, we reject $H_0$ in favor of $H_1$ if and only if $\hat{U} > \hat{u}_{\alpha,n_1,n_2}$, or in other words if the corresponding p-value is less than the desired level of significance $\alpha$. The p-value corresponding to the $\hat{U}$ test is 0.0426 and therefore we reject the null hypothesis $H_0$ with these data using the $\hat{U}$ test at the 5% level of significance. Using the $\hat{T}$ test with significance level $\alpha = 0.05$ we reject $H_0$ in favor of $H_1$ if and only if $\hat{T} > \hat{t}_{\alpha,n_1,n_2}$, or equivalently if the corresponding p-value is less than 0.05. For these data the p-value corresponding to the $\hat{T}$ test comes out to be 0.0586. So on the basis of the given data we accept $H_0$ at the 5% level of significance using the $\hat{T}$ test.
Clearly the two tests lead to different decisions. So we may now proceed to illustrate the application of the proposed adaptive procedures. We first consider the AD1 test. For this we calculate $Q(x) = 0.9275325$ and $Q(y) = 0.8336492$. Both the observed $Q(x)$ and $Q(y)$ lie in the interval (0.5, 2.3). Thus, based on this measure, we may assume that both underlying distributions are symmetric. We therefore use the $\hat{U}$ test and hence reject $H_0$ at the 5% level of significance.
To perform the AD2 test we need to compute the p-values of the triples test for each of the two samples. The observed $p_1$ and $p_2$ values are 0.8030368 and 0.5041223, respectively. We now perform a Bernoullian trial with probability of success $p = \min(p_1, p_2) = 0.5041223$. If success occurs use the $\hat{U}$ test, otherwise use the $\hat{T}$ test.
It can be observed that both p-values are much higher than 0.05, so we may consider both underlying distributions to be symmetric. Thus it is desirable to use the $\hat{U}$ test with a very high probability. However, in the previous approach we used the $\hat{U}$ test with probability 0.5041223 only. To overcome this drawback we consider the AD3 test. It has already been pointed out that the cdf of the beta distribution with indices (1, 13.513406) is taken as the function $k(\cdot)$. So here $p_1^* = 1$ (to the reported precision), $p_2^* = 0.9999235$ and thus $p^* = \min(p_1^*, p_2^*) = 0.9999235$. We now perform a Bernoullian trial with probability of success $p^*$. If success occurs use the $\hat{U}$ test, otherwise use the $\hat{T}$ test.

Relative Comparisons of the Competing Tests
In this section we present the results of a simulation study to assess the performance of the proposed adaptive procedures relative to the existing non-adaptive competitors. The distributions that are used to generate the data for this simulation study are members of the generalized lambda family of distributions discussed in Ramberg and Schmeiser (1974). This family provides a wide range of distributions that are easily generated since they are defined in terms of the inverse of the cdf
$$F^{-1}(u) = \lambda_1 + \frac{u^{\lambda_3} - (1 - u)^{\lambda_4}}{\lambda_2}, \quad 0 < u < 1,$$
where $\lambda_1$ and $\lambda_2$ are the location and scale parameters, respectively. The parameters $\lambda_3$ and $\lambda_4$ determine the shape of the distribution. The generalized lambda distributions are used because the skewness ($\alpha_3 = E(X - \mu)^3/\sigma^3$) and kurtosis ($\alpha_4 = E(X - \mu)^4/\sigma^4$) can be specified. The distributions used in this study are chosen so that their skewness and kurtosis cover a wide range of possibilities for the underlying distribution. The parameters defining the 10 selected distributions along with the associated skewness and kurtosis values are given in Table 2. The first 5 distributions considered are symmetric while the remaining 5 distributions are skewed.

We consider the three proposed adaptive procedures AD1, AD2 and AD3, and the other tests included in the study are the two component tests of the adaptive rule, i.e., the $\hat{T}$ test and the $\hat{U}$ test. The whole computation is carried out taking the nominal level of significance to be $\alpha = 0.05$. The entries given in Tables 3-6 are generated via computer simulations with 5,000 replications for each configuration. The empirical size and the power of the tests are computed as the relative frequency with which a particular test rejects the null hypothesis $H_0$. We investigate the powers at $\theta_Y = \xi_{0.5}, \xi_{0.6}, \xi_{0.7}$, where $\xi_q$ is the $q$th quantile of the distribution of Y. The results presented in Tables 3 and 5 are computed assuming equal scale for both underlying distributions, i.e., the simulations are carried out using two generalized lambda distributions having equal scale and shape parameters. Those in Tables 4 and 6 are computed without the assumption of equality of the scale parameters of the two underlying distributions. Here the simulations are performed using two generalized lambda distributions keeping $\lambda_3$, $\lambda_4$ fixed but varying $\lambda_2$; in our study the scale parameter of the second population is taken to be twice that of the first population.

From the results of the simulation study, we may conclude that all three proposed adaptive test procedures AD1, AD2 and AD3 are robust in nearly all cases. The proposed modified probabilistic approach AD3, although slightly anti-conservative in a few situations, is reasonably better than the other two proposed adaptive procedures in terms of the total error. On the other hand, the $\hat{U}$ test does not hold its nominal level very well when the underlying distributions are not both symmetric. Thus it is not justifiable to include it in any power comparisons, as the high powers might easily be attributed to the inflated levels. However, all the proposed adaptive procedures perform better than the $\hat{T}$ test in all situations except for distribution 5 (the lambda approximation to the Cauchy distribution), where the $\hat{T}$ test is the best test. The proposed deterministic approach AD1 and the modified probabilistic approach AD3 have the nearest power to the $\hat{U}$ test, which is the best test when both underlying distributions are symmetric, except for the situation already mentioned. The proposed probabilistic approach AD2 achieves the desired level of significance more closely than the modified approach but, as expected, the modified AD3 test is much more powerful than the AD2 test. In a number of situations the proposed AD3 test is even more powerful than the AD1 test. Thus, overall we may say that there is not much difference between the AD1 test and the AD3 test in terms of power, and hence these two tests seem to be preferable to the other existing competitors for the generalized Behrens-Fisher problem.
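Sampling from a generalized lambda distribution by inverting the cdf, as used in the simulation design above, can be sketched in Python as follows. The function names are hypothetical and the parameter values in the usage note are purely illustrative; they are not the entries of Table 2.

```python
import random

def gld_quantile(u, lam1, lam2, lam3, lam4):
    """Ramberg-Schmeiser inverse cdf at u, for 0 < u < 1."""
    return lam1 + (u ** lam3 - (1.0 - u) ** lam4) / lam2

def gld_sample(n, lam1, lam2, lam3, lam4, rng=random):
    """Draw n observations by applying the inverse cdf to uniform variates."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u == 0.0:
            # Skip the endpoint, which is not valid for negative shape values.
            continue
        draws.append(gld_quantile(u, lam1, lam2, lam3, lam4))
    return draws
```

For instance, `gld_sample(15, 0.0, 0.1975, 0.1349, 0.1349, random.Random(1))` draws 15 values from a distribution that is symmetric about 0, since the two shape parameters are equal; changing only `lam1` shifts the median, which is how location alternatives can be generated.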

Some Asymptotics
We now consider some asymptotic properties of the proposed adaptive test statistics. As the asymptotics for all three proposed procedures are the same, we consider here the test based on AD2 only. We know that the modified Mood's median test statistic is asymptotically standard normal. Again, if the underlying distributions are symmetric, the null distribution of the modified Wilcoxon-Mann-Whitney statistic is asymptotically standard normal. Also note that the magnitude of the p-values of the triples tests involved in the proposed test is an indicator of the amount of asymmetry present in each of the two underlying distributions. In order to study the asymptotic properties of the proposed adaptive procedures we need to look into the asymptotic behavior of the p-values of the preliminary tests. The p-value for the triples test is defined as $P(|V| \ge |v|)$, computed under the null hypothesis of symmetry, where $v$ is the observed value of $V$. Since the triples test statistic $V$ is asymptotically standard normal, we can approximate the p-value by $2(1 - \Phi(|v|))$, where $\Phi$ is the standard normal cdf. The p-value is itself a random variable whose distribution, under the null hypothesis, is asymptotically uniform over (0, 1). Moreover, under the alternative hypothesis, the p-value goes to zero with probability one as the sample size becomes large. We know that, in general, convergence in distribution does not imply convergence of the corresponding expected values. However, for every uniformly bounded continuous function $g$, convergence in distribution of the sequence of random variables $\{W_n\}$ to the random variable $W$ does imply $E[g(W_n)] \to E[g(W)]$. Thus, as the sample size increases, the expected p-value here approaches 1 and 0 under the null and the alternative hypotheses, respectively. Hereinafter, whenever we discuss the limiting p-values we refer to the limit of the expected p-values. If the data provide no sufficient evidence of asymmetry in either sample, then $p = \min(p_1, p_2)$ approaches 1; otherwise it tends to 0. Keeping the above results in mind we now study the asymptotic behavior of
$$AD2 = \hat{U}\, I(U^* \le p) + \hat{T}\, I(U^* > p),$$
where $I(x)$ is an indicator function assuming the values 1 or 0 according as $x$ is true or false, and $U^*$ is uniformly distributed over (0, 1) and is independent of the two samples.

At the first stage a normal QQ (quantile-quantile) plot is used to check the asymptotic normality of AD2 under $H_0$. The normal QQ plot is a very useful visual tool for assessing whether the distribution of a given variable follows a normal distribution. The QQ plot plots the empirical quantiles against the theoretical quantiles of the normal distribution. When the distribution of the variable under examination has the same shape as the reference distribution, the normal distribution in this case, the QQ plot is linear. We have generated QQ plots using some of the 10 selected members of the generalized lambda family and present here the normal QQ plots for distribution 3, which is symmetric and has moderate tailweight, along with distribution 8, which is positively skewed with heavier tailweight. All the normal QQ plots presented here are generated using two distributions with unequal scale parameters. The normal QQ plots seem to be fairly linear, in line with the limiting result $\lim \psi_N(\tau) = \Phi(\tau)$, where $\psi_N(\tau)$ denotes the null distribution function of AD2, which holds without any assumption regarding the symmetry of the two underlying distributions.
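The normal approximation to the triples-test p-value used in this section can be sketched in Python with the standard library only. The function names are hypothetical, and the argument `v` is assumed to be the already standardized triples statistic.

```python
import math

def normal_cdf(z):
    """Standard normal cdf, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sided_p_value(v):
    """Approximate p-value 2 * (1 - Phi(|v|)) for a standardized statistic v."""
    return 2.0 * (1.0 - normal_cdf(abs(v)))
```

A standardized value of 0 gives a p-value of 1, and values near 1.96 in absolute size give p-values near 0.05, so p-values obtained this way can be passed directly to the randomized selection rule.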
The asymptotic power properties of the adaptive test procedures depend on the criteria and the statistic used. We focus our attention on the case where $F$ and $G$ belong to the same location and scale family. Let $X_1, X_2, \ldots, X_{n_1}$ and $Y_1, Y_2, \ldots, Y_{n_2}$ be independent random samples from populations with continuous cdf's $F(x) = H(x - \theta_X)$ and $G(y) = H\{(y - \theta_Y)/\tau\}$, respectively, where $H(\cdot)$ is an arbitrary continuous cdf with $H(0) = 1/2$. Suppose $H(x)$ has the density $h(x)$ at all real $x$. Then under the sequence of Pitman local alternatives (7), indexed by a constant $b$, and suitable regularity conditions, the asymptotic power of the modified Mood's test is given by $\Phi(e_1 b - \tau_\alpha)$ and that of the modified Wilcoxon-Mann-Whitney test is given by $\Phi(e_2 b - \tau_\alpha)$, where $e_1$ and $e_2$ are the efficacies of the respective tests. We see that the power of the adaptive test converges to the power of the better component. Thus under (7) the power of each of the proposed adaptive test procedures converges to the power of the $\hat{U}$ test when both underlying distributions are symmetric, and to that of the $\hat{T}$ test otherwise. This is shown in the following result.
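Result 2. The asymptotic power of the adaptive test AD2 under the sequence of local alternatives (7) is $\Phi(e_2 b - \tau_\alpha)$ when both underlying distributions are symmetric and $\Phi(e_1 b - \tau_\alpha)$ otherwise.
Proof. The asymptotic power of the upper $\alpha$ level AD2 test under (7) is obtained using the same technique as in Result 1.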

Concluding Remarks
In the nonparametric two-sample location problem, the most common task is to test the null hypothesis that the two populations have equal medians against one- or two-sided alternatives. Here the only assumption is that the two populations are continuous. In this paper we have developed some adaptive procedures for the so-called generalized Behrens-Fisher problem. The deterministic approach (AD1) proposed in this paper is based on calculating a simple measure of symmetry for each sample, involving the terms $\bar{U}_\gamma(x)$, $\bar{M}_\gamma(x)$, $\bar{L}_\gamma(x)$ and $\bar{U}_\gamma(y)$, $\bar{M}_\gamma(y)$, $\bar{L}_\gamma(y)$, which denote the averages of the $\gamma n_1$ and $\gamma n_2$ largest, middle and smallest combined order statistics for the X and Y samples, respectively. If each of these measures falls in a specific interval, then both the underlying distributions are considered to be symmetric. A disadvantage of this deterministic approach is that both the choice of $\gamma$ and that of the interval are made subjectively.
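A selector of this kind can be sketched as follows. The ratio $(\bar{U}_\gamma - \bar{M}_\gamma)/(\bar{M}_\gamma - \bar{L}_\gamma)$ used here follows the general form of Hogg's skewness selector and is an assumption, as are the default value of `gamma`, the acceptance interval and all function names; none of these is taken from the paper.

```python
def tail_and_middle_means(sample, gamma):
    """Averages of the gamma*n largest, middle and smallest order statistics."""
    xs = sorted(sample)
    n = len(xs)
    k = max(1, round(gamma * n))   # number of order statistics in each group
    lower = sum(xs[:k]) / k
    upper = sum(xs[-k:]) / k
    start = (n - k) // 2           # centre the middle block
    middle = sum(xs[start:start + k]) / k
    return upper, middle, lower


def symmetry_measure(sample, gamma):
    """Hogg-type ratio; close to 1 for a symmetric sample (assumed form).

    Undefined if the middle and lower averages coincide (heavily tied data).
    """
    upper, middle, lower = tail_and_middle_means(sample, gamma)
    return (upper - middle) / (middle - lower)


def looks_symmetric(sample, gamma=0.1, interval=(0.5, 2.0)):
    """Deterministic rule: symmetric if the measure falls in the interval.

    Both gamma and the interval are hypothetical placeholders.
    """
    low, high = interval
    return low <= symmetry_measure(sample, gamma) <= high
```

Under this sketch, a deterministic adaptive rule would apply the modified Wilcoxon-Mann-Whitney test when `looks_symmetric` holds for both samples and the modified Mood's median test otherwise, which makes the dependence on the subjective choices of `gamma` and `interval` explicit.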
In the proposed probabilistic approach we make use of the triples test, which is asymptotically distribution-free for testing whether a distribution is symmetric about some unknown location parameter. We wish to compare our proposed p-value based randomized classification rule with a deterministic rule based on the concept of Hogg. For this reason, the proposed deterministic approach uses the same selector statistic as in Hogg et al. (1975) to assess the skewness of the underlying distribution.
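The randomized classification rule can be illustrated with a short sketch. It assumes that the two triples-test p-values are already available, that they are combined through their minimum, and that the modified Wilcoxon-Mann-Whitney statistic is selected when an independent uniform draw $U^*$ does not exceed that minimum; the function name and return labels are hypothetical.

```python
import random


def select_component_test(p_x, p_y, rng=random):
    """Randomized choice between the two component tests.

    p_x, p_y: triples-test p-values for the X and Y samples (assumed given).
    Returns "modified_wmw" with probability min(p_x, p_y), else "modified_mood".
    """
    p = min(p_x, p_y)
    u_star = rng.random()  # U* ~ Uniform(0, 1), independent of the data
    return "modified_wmw" if u_star <= p else "modified_mood"
```

Strong evidence of asymmetry in either sample (a small p-value) therefore makes the median-based test the likely choice, while large p-values in both samples favour the Wilcoxon-type test.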
The simulation study demonstrates that the proposed probabilistic approach AD2 is robust in nearly all situations, whereas the proposed deterministic approach AD1 and the modified probabilistic approach AD3, while still reasonably robust, have considerably higher power. We have already pointed out the subjectivity in the proposed AD1 test, and we therefore recommend using the AD3 test for the two-sample generalized Behrens-Fisher problem. The result seems to be important for statistical applications, because a practicing statistician usually has little or no information about the underlying distribution of the data. Adaptive tests are not designed to be optimal for any particular distribution, but this study convinces us that they are certainly worth considering in practical problems. Note that, although it is assumed that $F$ and $G$ belong to the same location and scale family in order to perform the simulation study and to obtain the expression for the asymptotic power of the proposed adaptive test, the tests are also valid when $F$ and $G$ do not belong to the same location and scale family. It should also be noted that adaptive tests can be constructed for the multisample generalized Behrens-Fisher problem.
In the modified probabilistic approach AD3 it is argued that the random experiment should be done with probability $p^* = \min(p_1^*, p_2^*)$, where $p_1^* = k(p_1)$, $p_2^* = k(p_2)$ and $k(\cdot)$ is such that i) $k$ is monotone non-decreasing, ii) $k(0) = 0$, iii) $k(0.05) = 0.5$ and iv) $k(1) = 1$. For example, we can take $k(\cdot)$ as the cdf of a suitable beta$(\beta_1, \beta_2)$ distribution with median at $\alpha$. We fix $\beta_1$ and find the $\beta_2$ for which the median equals $\alpha$. For the fixed $\alpha$, different choices of $\beta_1$ lead to different values of $\beta_2$, and hence to different possible choices of $p^*$. Here we have used one such choice of $\beta_1$ and $\beta_2$. However, it may be possible to find an optimal $\beta_1$ using some criterion. We defer this to future research.
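As a worked instance of such a function $k$, consider the special case $\beta_1 = 1$, chosen here only because the beta cdf then has the closed form $1 - (1 - p)^{\beta_2}$; this is an illustrative assumption and not necessarily the choice used in the simulation study. Setting the median equal to $\alpha$ gives $\beta_2 = \log(0.5)/\log(1 - \alpha)$, which is about 13.51 for $\alpha = 0.05$. Function names are hypothetical.

```python
import math


def beta2_for_median(alpha):
    """beta_2 such that Beta(1, beta_2) has median alpha (beta_1 = 1 assumed)."""
    return math.log(0.5) / math.log(1.0 - alpha)


def k_transform(p, beta2):
    """cdf of Beta(1, beta2) at p, i.e. 1 - (1 - p)**beta2."""
    return 1.0 - (1.0 - p) ** beta2


def modified_probability(p1, p2, alpha=0.05):
    """p* = min(k(p1), k(p2)) with k the Beta(1, beta_2) cdf."""
    beta2 = beta2_for_median(alpha)
    return min(k_transform(p1, beta2), k_transform(p2, beta2))
```

With this choice $k(0) = 0$, $k(\alpha) = 0.5$ and $k(1) = 1$, and $k$ is increasing on $[0, 1]$, so conditions i) to iv) hold when $\alpha = 0.05$.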

Table 2: Distributions used in the simulation study.

Table 3: Empirical size and power of the tests for $n_1 = n_2 = 20$ (equal scale).

Table 4: Empirical size and power of the tests for $n_1 = n_2 = 20$ (unequal scale).

Table 5: Empirical size and power of the tests for $n_1 = 25$, $n_2 = 15$ (equal scale).

Table 6: Empirical size and power of the tests for $n_1 = 25$, $n_2 = 15$ (unequal scale).

the combined sample size $N = 40$ with $n_1 = n_2 = 20$ and $n_1 = 25$, $n_2 = 15$. It is also observed that as we increase the combined sample size to $N = 60$ with $n_1 = n_2 = 30$ and $n_1 = 35$, $n_2 = 25$, the normal QQ plots tend to be much more linear. Thus the normal QQ plots provide a fair indication of the asymptotic normality of AD2 under $H_0$. We now proceed to verify the asymptotic normality of AD2 under $H_0$ theoretically in the following result:

Proof. Let $\psi_N(\tau)$ denote the cdf corresponding to AD2 under $H_0$. Further, we denote the events $[\hat{U} \le \tau]$, $[T \le \tau]$ and $[U^* \le p]$ by $A_N$, $B_N$ and $E_N$, respectively. Then we can