Decomposition of Measure for Marginal Homogeneity in Square Contingency Tables with Ordered Categories

Abstract: For the analysis of square contingency tables with ordered categories, Tomizawa et al. (2003) considered a measure to represent the degree of departure from marginal homogeneity (MH). Tomizawa (1993) considered an extended marginal homogeneity (EMH) model. This paper (i) proposes a measure to represent the degree of departure from EMH, (ii) proposes a measure to represent the degree of departure from equality of marginal means (E), and (iii) gives a theorem that the value of the measure for MH equals the sum of the value of the measure for EMH and that for E.


Introduction
Consider an $r \times r$ square contingency table with the same row and column classifications. Let $p_{ij}$ denote the probability that an observation will fall in the $i$th row and $j$th column of the table ($i = 1, \ldots, r$; $j = 1, \ldots, r$).
Consider the marginal homogeneity (MH) model defined by
$$p_{i\cdot} = p_{\cdot i} \quad (i = 1, \ldots, r),$$
where $p_{i\cdot} = \sum_{t=1}^{r} p_{it}$ and $p_{\cdot i} = \sum_{s=1}^{r} p_{si}$; see, e.g., Stuart (1955) and Bishop, Fienberg, and Holland (1975, p. 282). Let
$$G_{1(i)} = \sum_{s=1}^{i} \sum_{t=i+1}^{r} p_{st}, \qquad G_{2(i)} = \sum_{s=i+1}^{r} \sum_{t=1}^{i} p_{st}$$
for $i = 1, \ldots, r-1$. By considering the difference between the cumulative marginal probabilities, the MH model may be expressed as
$$G_{1(i)} = G_{2(i)} \quad (i = 1, \ldots, r-1).$$
This states that the cumulative probability that an observation will fall in row category $i$ or below and column category $i+1$ or above is equal to the cumulative probability that the observation falls in column category $i$ or below and row category $i+1$ or above. Consider the extended marginal homogeneity (EMH) model (Tomizawa, 1993) defined by
$$G_{1(i)} = \delta G_{2(i)} \quad (i = 1, \ldots, r-1).$$
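The cumulative probabilities used by the MH and EMH models can be computed directly from a table. The following is a minimal sketch, assuming the verbal definition above: $G_{1(i)}$ sums the cells with row category at most $i$ and column category at least $i+1$, and $G_{2(i)}$ sums the mirror-image cells. The function names and the 3 × 3 example counts are hypothetical and serve only as illustration.

```python
from typing import List, Sequence, Tuple

Table = Sequence[Sequence[float]]


def cumulative_g(table: Table) -> Tuple[List[float], List[float]]:
    """Return (G1, G2), each of length r-1, for an r x r table.

    Works on probabilities or on raw counts (counts give unnormalised sums).
    """
    r = len(table)
    g1, g2 = [], []
    for i in range(1, r):  # cut point i = 1, ..., r-1 (1-based categories)
        # rows 1..i, columns i+1..r
        g1.append(sum(table[s][t] for s in range(i) for t in range(i, r)))
        # rows i+1..r, columns 1..i
        g2.append(sum(table[s][t] for s in range(i, r) for t in range(i)))
    return g1, g2


def holds_mh(table: Table, tol: float = 1e-12) -> bool:
    """MH holds when G1(i) = G2(i) at every cut point."""
    g1, g2 = cumulative_g(table)
    return all(abs(a - b) <= tol for a, b in zip(g1, g2))


def emh_ratios(table: Table) -> List[float]:
    """Ratios G1(i)/G2(i); EMH holds when they share a single value delta.

    Assumes every G2(i) is positive.
    """
    g1, g2 = cumulative_g(table)
    return [a / b for a, b in zip(g1, g2)]


# Hypothetical counts, built so that EMH holds with delta = 2 but MH does not.
example = [[10, 4, 2], [1, 10, 4], [2, 1, 10]]
print(cumulative_g(example))  # ([6, 6], [3, 3])
print(holds_mh(example))      # False
print(emh_ratios(example))    # [2.0, 2.0]
```

Because the EMH condition only involves ratios, the sketch can be applied to counts without normalising them first.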
Let $X$ and $Y$ denote the row and column variables, respectively. Consider the model of equality of marginal means (E) defined by
$$E(X) = E(Y).$$
Tomizawa (1991) pointed out that the MH model holds if and only if both the EMH and the E models hold. Tomizawa (1995) and Tomizawa and Makii (2001) considered measures to represent the degree of departure from MH for data on a nominal scale, and Tomizawa et al. (2003) considered them for data on an ordinal scale; see the Appendix for the Kullback-Leibler (KL) information type measure $\Gamma_{\mathrm{MH}}$ proposed in Tomizawa et al. (2003).
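The equivalence between MH and the pair (EMH, E) can be traced through the cumulative probabilities. The derivation below is a sketch under the assumption that the categories carry the integer scores $1, \ldots, r$, so that $E(X) = \sum_i i\, p_{i\cdot}$ and $E(X) = r - \sum_{i=1}^{r-1} F^X_i$. The symbols $F^X_i = \Pr(X \le i)$ and $F^Y_i = \Pr(Y \le i)$ are introduced here for illustration only, with $G_{1(i)} = \Pr(X \le i,\, Y \ge i+1)$ and $G_{2(i)} = \Pr(X \ge i+1,\, Y \le i)$.

```latex
\begin{align*}
F^X_i - F^Y_i
  &= \Pr(X \le i,\, Y \ge i+1) - \Pr(X \ge i+1,\, Y \le i)
   = G_{1(i)} - G_{2(i)}, \\
E(Y) - E(X)
  &= \sum_{i=1}^{r-1} \bigl( F^X_i - F^Y_i \bigr)
   = \sum_{i=1}^{r-1} G_{1(i)} - \sum_{i=1}^{r-1} G_{2(i)} .
\end{align*}
```

Under this scoring, E holds if and only if $\sum_i G_{1(i)} = \sum_i G_{2(i)}$. If EMH holds as well, then $\delta \sum_i G_{2(i)} = \sum_i G_{2(i)}$, which forces $\delta = 1$ whenever $\sum_i G_{2(i)} > 0$, and EMH with $\delta = 1$ is exactly MH. Conversely, MH implies EMH with $\delta = 1$ and, by the second line, E.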
The measure $\Gamma_{\mathrm{MH}}$ represents the degree of departure from MH, not from EMH, so it cannot be used when we want to see the degree of departure from EMH. Therefore, for data on an ordinal scale, we are interested in a measure that represents the degree of departure from EMH.
The purpose of this paper is (i) to propose a measure which represents the degree of departure from EMH (denoted by $\Gamma_{\mathrm{EMH}}$), (ii) to propose a measure which represents the degree of departure from E (denoted by $\Gamma_{\mathrm{E}}$), and (iii) to give the theorem that the value of $\Gamma_{\mathrm{MH}}$ is equal to the sum of the value of $\Gamma_{\mathrm{EMH}}$ and the value of $\Gamma_{\mathrm{E}}$. We emphasize that the measure $\Gamma_{\mathrm{EMH}}$ proposed in this paper is entirely different from the measures which represent the degree of departure from MH in Tomizawa (1995), Tomizawa and Makii (2001), and Tomizawa et al. (2003).

Measure for Extended Marginal Homogeneity
We shall consider a measure to represent the degree of departure from EMH. Let [...]. Assuming that $\Delta^*_U > 0$, $\Delta^*_L > 0$, and $\{G_{1(i)} + G_{2(i)} > 0\}$, consider a measure defined by [...] where [...]. Note that [...]. According to the KL information, $\Gamma_{\mathrm{EMH}}$ represents the degree of departure from EMH, and the degree increases as the value of $\Gamma_{\mathrm{EMH}}$ increases.

Measure for the Equality of Marginal Means
We shall consider a measure to represent the degree of departure from the E model. We note that E [...], although the details are omitted here. Assuming that $\Delta^*_U \ge 0$ and $\Delta^*_L \ge 0$, consider a measure defined by [...] where [...]. This may be expressed as [...], and therefore $\Gamma_{\mathrm{E}}$ must lie between 0 and 1. We also see that (i) $\Gamma_{\mathrm{E}} = 0$ if and only if there is a structure of E in the $r \times r$ table, and (ii) $\Gamma_{\mathrm{E}} = 1$ if and only if the degree of departure from E is the largest, in the sense that $\Delta^*_U = 0$ (then [...]). According to the KL information or the Shannon entropy, $\Gamma_{\mathrm{E}}$ represents the degree of departure from the E model, and the degree increases as the value of $\Gamma_{\mathrm{E}}$ increases.

Relationships between the Measures
We obtain the following theorem.
Theorem 1. The value of $\Gamma_{\mathrm{MH}}$ equals the sum of the value of $\Gamma_{\mathrm{EMH}}$ and the value of $\Gamma_{\mathrm{E}}$.
Proof. It is easily seen that the right-hand side of equation (1) plus the right-hand side of equation (2) equals the right-hand side of equation (3) in the Appendix. Thus, the proof is completed.
From Theorem 1, $\Gamma_{\mathrm{EMH}}$ is expressed as $\Gamma_{\mathrm{EMH}} = \Gamma_{\mathrm{MH}} - \Gamma_{\mathrm{E}}$. Therefore, the measure $\Gamma_{\mathrm{EMH}}$ also indicates the degree of departure from MH excluding the influence of the degree of departure from E.
From (1) we see that $\Gamma_{\mathrm{EMH}} \ge 0$. Thus we obtain the next theorem.
Theorem 2. The value of $\Gamma_{\mathrm{MH}}$ is greater than or equal to the value of $\Gamma_{\mathrm{E}}$. Equality holds if and only if there is a structure of EMH in the $r \times r$ table.
We also see that [...] (ii) [...]. It seems appropriate to consider that the degree of departure from EMH is then the largest.
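The additive relationship in Theorem 1 can be checked numerically. The sketch below rests on assumed forms for the three measures, because the defining equations (1) to (3) are not reproduced in this text: the cumulative sums are normalised by $\Delta = \sum_i (G_{1(i)} + G_{2(i)})$, $\Gamma_{\mathrm{MH}}$ is taken as a KL-type divergence of $\{G_{1(i)}, G_{2(i)}\}$ from their pairwise averages, $\Gamma_{\mathrm{E}}$ as one minus the Shannon entropy of $(\Delta^*_U, \Delta^*_L)$ in bits, and $\Gamma_{\mathrm{EMH}}$ as the matching KL-type term. These forms were chosen to agree with the properties stated above (range, zero under the model, additivity) and should not be read as the authors' exact definitions; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Table = Sequence[Sequence[float]]


def cumulative_g(table: Table) -> Tuple[List[float], List[float]]:
    """G1(i) and G2(i) for i = 1, ..., r-1 (probabilities or counts)."""
    r = len(table)
    g1 = [sum(table[s][t] for s in range(i) for t in range(i, r)) for i in range(1, r)]
    g2 = [sum(table[s][t] for s in range(i, r) for t in range(i)) for i in range(1, r)]
    return g1, g2


def _xlog(x: float, y: float) -> float:
    """x * log(x / y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x / y)


def measures(table: Table) -> Tuple[float, float, float]:
    """Return (gamma_mh, gamma_emh, gamma_e) under the ASSUMED forms above."""
    g1, g2 = cumulative_g(table)
    delta = sum(g1) + sum(g2)
    a = [g / delta for g in g1]   # normalised G1(i)
    b = [g / delta for g in g2]   # normalised G2(i)
    du, dl = sum(a), sum(b)       # assumed Delta*_U and Delta*_L, du + dl = 1
    log2 = math.log(2)
    # Assumed Gamma_MH: divergence of (a_i, b_i) from their pairwise average.
    gamma_mh = sum(_xlog(x, (x + y) / 2) + _xlog(y, (x + y) / 2)
                   for x, y in zip(a, b)) / log2
    # Assumed Gamma_E: 1 minus the entropy of (du, dl) measured in bits.
    gamma_e = (_xlog(du, 0.5) + _xlog(dl, 0.5)) / log2
    # Assumed Gamma_EMH: divergence of (a_i, b_i) from the EMH-type fit.
    gamma_emh = sum(_xlog(x, du * (x + y)) + _xlog(y, dl * (x + y))
                    for x, y in zip(a, b)) / log2
    return gamma_mh, gamma_emh, gamma_e


# Hypothetical counts: the first table satisfies EMH (delta = 2), the second does not.
for counts in ([[10, 4, 2], [1, 10, 4], [2, 1, 10]],
               [[5, 3, 1], [2, 5, 7], [1, 2, 5]]):
    mh, emh, e = measures(counts)
    print(abs(mh - (emh + e)) < 1e-12)  # additivity holds under these forms
```

Under these assumed forms the three quantities are computed independently of one another, so the printed check is a genuine numerical test of the additivity rather than a restatement of it.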

Approximate Confidence Intervals for the Measures
Let $n_{ij}$ denote the observed frequency in the $i$th row and $j$th column of the $r \times r$ square table ($i = 1, \ldots, r$; $j = 1, \ldots, r$). Assuming that the $\{n_{ij}\}$ result from full multinomial sampling, we consider the approximate standard errors and large-sample confidence intervals for $\Gamma_{\mathrm{EMH}}$ and $\Gamma_{\mathrm{E}}$ using the delta method as described by Bishop et al. (1975, Sec. 14.6) and Agresti (1990, Sec. 12.1). The sample version of $\Gamma_{\mathrm{EMH}}$, i.e., $\hat{\Gamma}_{\mathrm{EMH}}$, is given by $\Gamma_{\mathrm{EMH}}$ with $\{p_{ij}\}$ replaced by $\{\hat{p}_{ij}\}$, where $\hat{p}_{ij} = n_{ij}/n$ and $n = \sum\sum n_{ij}$. Similarly, $\hat{\Gamma}_{\mathrm{E}}$ is given. Using the delta method, $\sqrt{n}(\hat{\Gamma}_{\mathrm{EMH}} - \Gamma_{\mathrm{EMH}})$ has asymptotically (as $n \to \infty$) a normal distribution with mean zero and variance [...], where [...].

Similarly, $\sqrt{n}(\hat{\Gamma}_{\mathrm{E}} - \Gamma_{\mathrm{E}})$ has asymptotically a normal distribution with mean zero and variance [...], where [...]. Let $\widehat{\mathrm{var}}[\hat{\Gamma}]$ denote $\mathrm{var}[\hat{\Gamma}]$ with $\{p_{ij}\}$ replaced by $\{\hat{p}_{ij}\}$. Then $(\widehat{\mathrm{var}}[\hat{\Gamma}])^{1/2}/\sqrt{n}$ is the estimated approximate standard error for $\hat{\Gamma}$, giving an approximate confidence interval for $\Gamma$.
Note to Table 1. The status categories include (2) clerical, (3) sales, (4) skilled manual, (5) semiskilled manual, (6) unskilled manual, and (7) farmers.
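The delta-method interval described in this section can be sketched for any smooth function of the cell proportions. The code below is an illustration under stated assumptions, not the closed-form variance of the paper: it replaces the analytic partial derivatives by central finite differences and uses the general multinomial delta-method variance $\sum\sum p_{ij} (\partial f/\partial p_{ij})^2 - (\sum\sum p_{ij}\, \partial f/\partial p_{ij})^2$. The names `delta_method_ci` and `mean_difference` are hypothetical, and `mean_difference` (the difference $E(Y) - E(X)$ with integer scores) is only a stand-in functional chosen because its variance is easy to verify by hand.

```python
import math
from typing import Callable, List, Sequence, Tuple

Table = List[List[float]]


def delta_method_ci(
    counts: Sequence[Sequence[float]],
    measure: Callable[[Table], float],
    z: float = 1.96,     # normal quantile for a 95% interval
    eps: float = 1e-6,   # step for the numerical derivatives
) -> Tuple[float, float, Tuple[float, float]]:
    """Return (estimate, standard error, confidence interval).

    Assumes `measure` accepts slightly perturbed, unnormalised tables;
    measures involving logarithms may need a one-sided step at zero cells.
    """
    n = sum(sum(row) for row in counts)
    p_hat = [[c / n for c in row] for row in counts]
    estimate = measure(p_hat)

    def shifted(i: int, j: int, h: float) -> Table:
        q = [row[:] for row in p_hat]
        q[i][j] += h
        return q

    second_moment = 0.0
    mean_grad = 0.0
    for i, row in enumerate(p_hat):
        for j, p in enumerate(row):
            # central-difference approximation to d(measure)/d(p_ij)
            grad = (measure(shifted(i, j, eps)) - measure(shifted(i, j, -eps))) / (2 * eps)
            second_moment += p * grad ** 2
            mean_grad += p * grad
    variance = max(second_moment - mean_grad ** 2, 0.0)
    se = math.sqrt(variance / n)
    return estimate, se, (estimate - z * se, estimate + z * se)


def mean_difference(p: Table) -> float:
    """Stand-in functional: E(Y) - E(X) with integer category scores."""
    return sum((j - i) * p[i][j] for i in range(len(p)) for j in range(len(p)))


# Hypothetical 3 x 3 counts, used only to exercise the function.
counts = [[10, 4, 2], [1, 10, 4], [2, 1, 10]]
estimate, se, (lower, upper) = delta_method_ci(counts, mean_difference)
print(round(estimate, 4), round(se, 4), lower < estimate < upper)
```

Numerical differentiation keeps the sketch generic, at the cost of some accuracy near boundary values; where closed-form derivatives are available they would normally be preferred.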

Examples
Example 1. The data in Table 1, taken from Tominaga (1979, p. 131), describe the cross-classification of father's and son's occupational status categories in Japan, examined in 1955 and 1975. Since the confidence interval for $\Gamma_{\mathrm{EMH}}$ applied to each of Tables 1a and 1b does not include zero (see Table 2), this would indicate that there is not a structure of EMH in either table. Let $G^2$ denote the likelihood ratio chi-squared statistic for testing goodness-of-fit of the model. The values of $G^2$ for the EMH model are 116.76 for Table 1a and 280.73 for Table 1b, each with $r - 2 = 6$ degrees of freedom (df). Therefore the EMH model fits each of these data sets poorly.
We compare the degree of departure from EMH between these tables using $\hat{\Gamma}_{\mathrm{EMH}}$. We can see from $\hat{\Gamma}_{\mathrm{EMH}}$ that (i) for Table 1a, the degree of departure from EMH is estimated to be 0.023 times the maximum degree of departure from EMH, and (ii) for Table 1b, it is estimated to be 0.055 times the maximum degree of departure from EMH.
When the degrees of departure from EMH in Tables 1a and 1b are compared using the confidence intervals for $\Gamma_{\mathrm{EMH}}$, the degree would be greater in Table 1b than in Table 1a.
Example 2. The data in Table 3, taken from Tallis (1962), describe the cross-classification of Merino ewes according to the number of lambs born in consecutive years, 1952 and 1953 (also see Bishop et al., 1975, p. 288).
Since the confidence intervals for $\Gamma_{\mathrm{MH}}$ and $\Gamma_{\mathrm{EMH}}$ do not include zero (see Table 4), these would indicate that neither the MH structure nor the EMH structure holds in the table. However, since the confidence interval for $\Gamma_{\mathrm{E}}$ includes zero (see Table 4), this would indicate that there is a structure of E in the table. Also, the degree of departure from EMH is larger than the degree of departure from E. Therefore we can state from Theorem 1 that the lack of structure of the MH model is caused by the lack of structure of the EMH model rather than that of the E model.
Example 3. Table 5, taken directly from Breslow and Day (1980, p. 185), gives the data from the Los Angeles study of endometrial cancer. These data are obtained from the 59 matched pairs using four dose levels of conjugated oestrogen: (1) none, (2) 0.1-0.299 mg, (3) 0.3-0.625 mg, and (4) 0.626+ mg. Since the confidence intervals for $\Gamma_{\mathrm{MH}}$ and $\Gamma_{\mathrm{E}}$ do not include zero (see Table 6), these would indicate that neither the MH structure nor the E structure holds in the table. On the other hand, since the confidence interval for $\Gamma_{\mathrm{EMH}}$ includes zero (see Table 6), this would indicate that there is a structure of EMH in the table. In addition, the degree of departure from E is larger than the degree of departure from EMH. Therefore we can state from Theorem 1 that the lack of structure of the MH model is caused by the lack of structure of the E model rather than that of the EMH model. This is in contrast to the case of Example 2.

Concluding Remarks
The measures $\hat{\Gamma}_{\mathrm{EMH}}$ and $\hat{\Gamma}_{\mathrm{E}}$ always range between 0 and 1, regardless of the dimension $r$ and the sample size $n$. So $\hat{\Gamma}_{\mathrm{EMH}}$ and $\hat{\Gamma}_{\mathrm{E}}$ may be useful for comparing the degree of departure from EMH and E, respectively, in several tables.
Consider the artificial data in Table 7. From the values of $G^2$ (with 2 df) for the EMH model (see Table 7d), we see that the EMH model fits the data in Table 7a worse than the data in Table 7b. In contrast, the value of $\hat{\Gamma}_{\mathrm{EMH}}$ is less for Table 7a than for Table 7b (see Table 7d). In terms of $\{\hat{G}_{1(i)}/\hat{G}_{2(i)}\}$, $i = 1, 2, 3$ (see Table 7d), it seems natural to conclude that the degree of departure from EMH is less for Table 7a than for Table 7b. Therefore $\hat{\Gamma}_{\mathrm{EMH}}$ may be preferable to $G^2$ for comparing the degree of departure from EMH in several tables. (For a similar reason, $\hat{\Gamma}_{\mathrm{EMH}}$ may also be preferable to the $P$-values for comparing them.) It may seem, to many readers, that $G^2/n$ is also a reasonable measure for representing the degree of departure from EMH. However, it does not seem to us that $G^2/n$ is a reasonable measure. For example, consider the artificial data in Tables 7b and 7c. The values of $G^2/n$ are 0.05 for Table 7b and 0.08 for Table 7c. Therefore the value of $G^2/n$ is less for Table 7b than for Table 7c. However, the value of $\hat{\Gamma}_{\mathrm{EMH}}$ for Table 7b is theoretically identical to that for Table 7c [...] Table 7b is identical to those for Table 7c (see Table 7d). It seems natural to conclude that the degree of departure from EMH for Table 7b is equal to that for Table 7c. Therefore $\hat{\Gamma}_{\mathrm{EMH}}$ may also be preferable to $G^2/n$ for comparing the degree of departure from EMH in several tables.
The measure $\hat{\Gamma}_{\mathrm{EMH}}$ would be useful when we want to see how far the departure from EMH lies toward the maximum departure from EMH. Note that $G^2$ and $\hat{\Gamma}_{\mathrm{MH}}$ cannot be used for this purpose.
We observe that (i) the measure $\Gamma_{\mathrm{EMH}}$ should be applied to contingency tables with ordered categories, (ii) the asymptotic normal distribution of $\sqrt{n}(\hat{\Gamma}_{\mathrm{EMH}} - \Gamma_{\mathrm{EMH}})$ may not be applicable when $\Gamma_{\mathrm{EMH}} = 0$ or $\Gamma_{\mathrm{EMH}} = 1$ because then $\mathrm{var}[\hat{\Gamma}_{\mathrm{EMH}}] = 0$, and (iii) $\hat{\Gamma}_{\mathrm{EMH}}$ cannot be used for testing goodness-of-fit of EMH (although $G^2$ can be used for it).

Discussion
Consider the artificial data in Table 8. The values of $G^2$ for the MH and EMH models are 389.17 with 3 df and 4.91 with 2 df, respectively. These values would indicate that, for these data, the MH structure does not hold but the EMH structure does. The estimated value of the measure $\Gamma_{\mathrm{MH}}$ is 0.517 and the estimated value of the measure $\Gamma_{\mathrm{EMH}}$ is 0.002. Thus, when we want to see the degree of departure from EMH, the measure $\Gamma_{\mathrm{EMH}}$ would be useful although the measure $\Gamma_{\mathrm{MH}}$ is not. We emphasize that the measure $\Gamma_{\mathrm{EMH}}$ proposed in this paper is entirely different from measures of the degree of departure from MH such as $\Gamma_{\mathrm{MH}}$, although the form may be similar.
Finally, we note that Tomizawa et al. (2003) also gave a power-divergence type measure to represent the degree of departure from MH; however, using the power-divergence, we cannot obtain a result similar to Theorem 1.
[...] and only if the EMH model holds, and (iii) $\Gamma_{\mathrm{EMH}} = 1$ if and only if the degree of departure from EMH is the largest, in the sense that $G_1$ [...]

Table 1: Occupational status for Japanese father-son pairs

Table 2: Estimates of $\Gamma_{\mathrm{EMH}}$, its approximate standard error, and 95% confidence interval

Table 3: Cross-classification of Merino ewes according to number of lambs born in consecutive years

Table 4: Estimates of $\Gamma_{\mathrm{EMH}}$, $\Gamma_{\mathrm{E}}$, and $\Gamma_{\mathrm{MH}}$, their approximate standard errors and 95% confidence intervals, applied to Table 3

Table 5: Average doses of conjugated oestrogen used by cases and matched controls: Los Angeles endometrial cancer study

Table 6: Estimates of $\Gamma_{\mathrm{EMH}}$, $\Gamma_{\mathrm{E}}$, and $\Gamma_{\mathrm{MH}}$, their approximate standard errors and 95% confidence intervals, applied to Table 5

Table 8: Artificial data