Ridit Score Type Quasi-symmetry and Decomposition of Symmetry for Square Contingency Tables with Ordered Categories

For square contingency tables with the same row and column ordinal classifications, this paper proposes a quasi-symmetry model based on the marginal ridits. The model indicates that the log-odds that an observation will fall in the (i, j) cell instead of in the (j, i) cell, i < j, is proportional to the difference between the average ridit score of the row and column marginal distributions for category j and that for category i. This paper also gives a theorem stating that the symmetry model holds if and only if both the proposed model and the marginal mean equality model hold. Examples are given.


Introduction
Consider an $R \times R$ square contingency table with the same row and column classifications. Let $p_{ij}$ denote the probability that an observation will fall in the $i$th row and $j$th column of the table ($i, j = 1, \ldots, R$). Caussinus (1965) considered the quasi-symmetry (QS) model, defined by
$$p_{ij} = \alpha_i \beta_j \psi_{ij} \quad (i, j = 1, \ldots, R),$$
where $\psi_{ij} = \psi_{ji}$. A special case of this model with $\{\alpha_i = \beta_i\}$ is the symmetry (S) model (see Bowker, 1948; Bishop, Fienberg, and Holland, 1975, p. 282; Tomizawa and Tahata, 2007). The marginal homogeneity (MH) model is defined by
$$p_{i\cdot} = p_{\cdot i} \quad (i = 1, \ldots, R),$$
where $p_{i\cdot} = \sum_{t=1}^{R} p_{it}$ and $p_{\cdot i} = \sum_{s=1}^{R} p_{si}$ (Stuart, 1955; Bishop et al., 1975, p. 293). Also, Caussinus (1965) gave the theorem that the S model holds if and only if both the QS and MH models hold.
Note that the linear diagonals-parameter symmetry (LDPS) model implies the QS model. Let $u_1 < \cdots < u_R$ denote the ordered known scores assigned to both the rows and the columns, which share the same classifications. The generalized LDPS model with $u_i$ in place of $i$ is the ordinal quasi-symmetry (OQS) model (Agresti, 2002, p. 429).
Let $X$ and $Y$ denote the row and column variables, respectively. We refer to the model of equality of marginal means, i.e., $E(X) = E(Y)$, as the ME model. Yamamoto, Iwashita, and Tomizawa (2007) gave the theorem that the S model holds if and only if both the LDPS and ME models hold (see also Tahata, Yamamoto, and Tomizawa, 2008). Let
$$r_i^X = \sum_{k=1}^{i-1} p_{k\cdot} + \frac{p_{i\cdot}}{2}, \qquad r_i^Y = \sum_{k=1}^{i-1} p_{\cdot k} + \frac{p_{\cdot i}}{2} \quad (i = 1, \ldots, R).$$
The $\{r_i^X\}$ and $\{r_i^Y\}$ are the marginal ridits; see Bross (1958), Fleiss, Levin, and Paik (2003, pp. 198-205), and Tahata, Miyamoto, and Tomizawa (2008). The $i$th ridit for the row (column) variable is the probability that an observation falls in row (column) category $i - 1$ or below plus half the probability that it falls in row (column) category $i$. The ridits also may be expressed as
$$r_i^X = \frac{F_{i-1}^X + F_i^X}{2}, \qquad r_i^Y = \frac{F_{i-1}^Y + F_i^Y}{2},$$
where $F_0^X = F_0^Y = 0$ and
$$F_i^X = \sum_{k=1}^{i} p_{k\cdot}, \qquad F_i^Y = \sum_{k=1}^{i} p_{\cdot k}, \quad i = 1, \ldots, R,$$
are the distribution functions of $X$ and $Y$, respectively.
Suppose that the categories of the ordinal row and column variables represent intervals of an underlying continuous distribution. If the underlying distribution is uniform over each interval, then $r_i^X$ ($r_i^Y$) would equal the probability that the row (column) value for a randomly selected individual falls below the midpoint of row (column) category $i$ (Agresti, 1984, p. 168). Therefore, for the analysis of square contingency tables with the same row and column ordinal classifications, we are now interested in proposing a quasi-symmetry model with the ridit scores $r_i^X$ and $r_i^Y$, instead of the scores $u_i$. We are also interested in considering a decomposition of the S model using the ridit score type quasi-symmetry model.
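As a minimal sketch of the ridit definition above, the following computes the row and column marginal ridits from a table of cell probabilities as the cumulative marginal probability through category i minus half the mass of category i. The function name `marginal_ridits` and the 3×3 table are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
import numpy as np

def marginal_ridits(p):
    """Return (r_x, r_y), the row and column marginal ridits of a square table p."""
    p = np.asarray(p, dtype=float)
    row = p.sum(axis=1)  # row marginals p_{i.}
    col = p.sum(axis=0)  # column marginals p_{.i}
    # Probability of category i-1 or below, plus half the probability of category i,
    # written as (cumulative probability through i) - (half of category i).
    r_x = np.cumsum(row) - row / 2.0
    r_y = np.cumsum(col) - col / 2.0
    return r_x, r_y

# Hypothetical 3x3 probability table (illustrative values only, summing to 1).
p_example = np.array([[0.20, 0.10, 0.05],
                      [0.05, 0.25, 0.10],
                      [0.02, 0.03, 0.20]])
r_x, r_y = marginal_ridits(p_example)
print("row ridits:   ", r_x)
print("column ridits:", r_y)
```

Each ridit lies strictly between 0 and 1 when all marginal probabilities are positive, and the ridits increase with the category index.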
The purpose of this paper is (1) to propose the ridit score type quasi-symmetry model, and (2) to give a decomposition of the S model using the proposed model.

Ridit Score Type Quasi-Symmetry Model
Consider a square contingency table with the same row and column ordinal classifications. Let
$$v_i = \frac{r_i^X + r_i^Y}{2} \quad (i = 1, \ldots, R).$$
Thus, $v_i$ is the average of the row and column ridits for category $i$.
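The scores $\{v_i\}$ are functions of the unknown cell probabilities $\{p_{ij}\}$ and are therefore themselves unknown. Consider a model defined by
$$p_{ij} = \begin{cases} \theta^{v_j - v_i} \psi_{ij} & (i < j), \\ \psi_{ij} & (i \ge j), \end{cases}$$
where $\psi_{ij} = \psi_{ji}$. We shall refer to this model as the ridit score type quasi-symmetry (RQS) model. The RQS model may be expressed as
$$\frac{p_{ij}}{p_{ji}} = \theta^{v_j - v_i} \quad (i < j).$$
This indicates that the log-odds that an observation will fall in the (i, j) cell instead of in the (j, i) cell, $i < j$, is proportional to the difference between $v_j$ and $v_i$, with proportionality constant $\log \theta$.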
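The log-odds statement can be turned into a simple descriptive check: for every pair i < j, the ratio log(p_ij / p_ji) / (v_j − v_i) should be the same constant, log θ, if the table satisfies the RQS model. The sketch below computes these pairwise ratios; the helper names `average_ridits` and `implied_log_theta` are hypothetical, the off-diagonal cells are assumed positive, and the 2×2 table is illustrative (any 2×2 table has a single pair, so the ratio is trivially constant).

```python
import numpy as np

def average_ridits(p):
    """Return v_i, the average of the row and column marginal ridits."""
    p = np.asarray(p, dtype=float)
    row, col = p.sum(axis=1), p.sum(axis=0)
    r_x = np.cumsum(row) - row / 2.0
    r_y = np.cumsum(col) - col / 2.0
    return (r_x + r_y) / 2.0

def implied_log_theta(p):
    """For each pair i < j (1-based keys), return log(p_ij / p_ji) / (v_j - v_i).

    Assumes positive off-diagonal cells and positive marginals, so that
    v_j - v_i > 0 for i < j. Under the RQS model all returned values coincide.
    """
    p = np.asarray(p, dtype=float)
    v = average_ridits(p)
    R = p.shape[0]
    ratios = {}
    for i in range(R):
        for j in range(i + 1, R):
            ratios[(i + 1, j + 1)] = np.log(p[i, j] / p[j, i]) / (v[j] - v[i])
    return ratios

# Hypothetical 2x2 table: v = (0.2, 0.7), so the ratio is log(1/3) / 0.5.
p_2x2 = np.array([[0.2, 0.1],
                  [0.3, 0.4]])
ratios_2x2 = implied_log_theta(p_2x2)

# A symmetric table has p_ij = p_ji, so every ratio is 0 (theta = 1).
p_sym = np.array([[0.2, 0.1, 0.05],
                  [0.1, 0.2, 0.05],
                  [0.05, 0.05, 0.2]])
ratios_sym = implied_log_theta(p_sym)
```

This is only a diagnostic on a given table, not a fitting procedure; widely differing ratios across pairs would suggest that a single θ does not describe the asymmetry.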
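A special case of the RQS model obtained by putting $\theta = 1$ is the S model. Also, the RQS model implies the QS model. Under the RQS model, $\theta > 1$ is equivalent to
$$F_i^X > F_i^Y \quad (i = 1, \ldots, R-1),$$
where $F_i^X = P(X \le i)$ and $F_i^Y = P(Y \le i)$ are the distribution functions of $X$ and $Y$. Therefore the parameter $\theta$ in the RQS model would be useful for making inferences such as that $X$ is stochastically less than $Y$ or vice versa.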
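Let $n_{ij}$ denote the observed frequency in the $i$th row and $j$th column of the table ($i, j = 1, \ldots, R$), with $n = \sum_{i}\sum_{j} n_{ij}$. Assume that a multinomial distribution applies to the $R \times R$ table. Denote the row and column marginal counts by $n_{i\cdot} = \sum_{t=1}^{R} n_{it}$ and $n_{\cdot i} = \sum_{s=1}^{R} n_{si}$, $i = 1, \ldots, R$, respectively. The average ranks in category $i$, written here as $m_i^X$ and $m_i^Y$, are
$$m_i^X = \sum_{k=1}^{i-1} n_{k\cdot} + \frac{n_{i\cdot} + 1}{2}, \qquad m_i^Y = \sum_{k=1}^{i-1} n_{\cdot k} + \frac{n_{\cdot i} + 1}{2}$$
(Stuart, 1963; Agresti, 1984, p. 178). These are referred to as the $i$th midranks. The $i$th midranks are related to the $i$th empirical ridits $\hat{r}_i^X$ and $\hat{r}_i^Y$ by
$$m_i^X = n \hat{r}_i^X + \frac{1}{2}, \qquad m_i^Y = n \hat{r}_i^Y + \frac{1}{2},$$
where $\hat{r}_i^X$ and $\hat{r}_i^Y$ denote $r_i^X$ and $r_i^Y$, respectively, with $p_{st}$ replaced by $\hat{p}_{st} = n_{st}/n$. Dividing these midranks by the sample size $n$ yields the rank scores:
$$s_i^X = \frac{m_i^X}{n} = \hat{r}_i^X + \frac{1}{2n}, \qquad s_i^Y = \frac{m_i^Y}{n} = \hat{r}_i^Y + \frac{1}{2n}.$$
Thus the empirical ridits $\hat{r}_i^X$ ($\hat{r}_i^Y$) and the rank scores $s_i^X$ ($s_i^Y$) are essentially the same for large $n$ (Agresti, 1984, p. 178; Freeman, 1987, p. 120).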
Austrian Journal of Statistics, Vol. 38 (2009), No. 3, 183-192
Therefore, the average of the row and column empirical ridits for category $i$,
$$\hat{v}_i = \frac{\hat{r}_i^X + \hat{r}_i^Y}{2},$$
is essentially the average of the rank scores $s_i^X$ and $s_i^Y$ for category $i$ for large $n$. Also, we see that
$$\hat{v}_j - \hat{v}_i = \frac{s_j^X + s_j^Y}{2} - \frac{s_i^X + s_i^Y}{2}.$$
Thus, under the RQS model with $\{p_{st}\}$ replaced by $\{\hat{p}_{st}\}$, we see that the estimated log-odds that an observation will fall in the (i, j) cell instead of in the (j, i) cell, $i < j$, would be proportional to the difference between the average rank score for category $j$ and that for category $i$.
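The relation between midranks, rank scores, and empirical ridits can be checked numerically. The sketch below computes all three from a table of counts; the function name `ridits_and_rank_scores` and the 3×3 counts are hypothetical, chosen only to illustrate that the midrank equals n times the empirical ridit plus 1/2, and that differences of average rank scores equal differences of average empirical ridits.

```python
import numpy as np

def ridits_and_rank_scores(counts):
    """Return (n, result) with empirical ridits, midranks and rank scores per margin."""
    tab = np.asarray(counts, dtype=float)
    n = tab.sum()
    result = {}
    for name, marg in (("X", tab.sum(axis=1)), ("Y", tab.sum(axis=0))):
        below = np.cumsum(marg) - marg          # observations in categories 1..i-1
        ridit = (below + marg / 2.0) / n        # empirical ridit
        midrank = below + (marg + 1.0) / 2.0    # average rank within category i
        result[name] = {"ridit": ridit,
                        "midrank": midrank,
                        "rank_score": midrank / n}
    return n, result

# Hypothetical 3x3 table of counts (illustrative values only).
counts_example = np.array([[10, 4, 1],
                           [3, 12, 5],
                           [2, 3, 10]])
n_total, res = ridits_and_rank_scores(counts_example)

# Average empirical ridits and average rank scores per category.
v_hat = (res["X"]["ridit"] + res["Y"]["ridit"]) / 2.0
s_bar = (res["X"]["rank_score"] + res["Y"]["rank_score"]) / 2.0
```

Because each rank score exceeds the corresponding empirical ridit by the same constant 1/(2n), the constant cancels in any difference between two categories.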
Further, the RQS model may be expressed as
$$\log \frac{p_{ij}}{p_{ji}} = \frac{\log \theta}{2} \left[ (r_j^X - r_i^X) + (r_j^Y - r_i^Y) \right] \quad (i < j).$$
Therefore, this model indicates that the log-odds that an observation will fall in the (i, j) cell instead of in the (j, i) cell, $i < j$, is proportional to the sum of two differences: the difference between the ridit scores for categories $j$ and $i$ of the row variable $X$, and the difference between the ridit scores for categories $j$ and $i$ of the column variable $Y$.
The maximum likelihood estimates of expected frequencies under the RQS model could be obtained by applying the Newton-Raphson method to the log-likelihood equations (see Appendix). For the RQS model, the free parameters are the $R(R+1)/2 - 1$ of $\{\psi_{ij}\}$ (symmetric, and constrained so that the cell probabilities sum to 1) and 1 of $\theta$, thus a total of $R(R+1)/2$. Therefore, the number of degrees of freedom for the RQS model is $(R^2 - 1) - R(R+1)/2 = (R+1)(R-2)/2$, which is one less than that for the S model, and equal to that for the LDPS model.
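The parameter count above can be reproduced with a few lines of arithmetic. This is a small sketch only; the function name `degrees_of_freedom` is hypothetical, and the S-model count is obtained by dropping θ from the RQS parameter list.

```python
def degrees_of_freedom(R):
    """Degrees of freedom of the S and RQS models for an R x R table."""
    saturated = R * R - 1                 # free cell probabilities
    psi_params = R * (R + 1) // 2 - 1     # symmetric psi_ij, minus the sum-to-one constraint
    df_s = saturated - psi_params         # S model: psi parameters only
    df_rqs = saturated - (psi_params + 1) # RQS model: psi parameters plus theta
    return {"S": df_s, "RQS": df_rqs}

df_4 = degrees_of_freedom(4)
df_5 = degrees_of_freedom(5)
print(df_4, df_5)
```

For any R, the RQS value agrees with the closed form (R + 1)(R − 2)/2 and is one less than the S value R(R − 1)/2.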

Decomposition of the Symmetry Model
We obtain the decomposition of the S model as follows: Theorem 1: The S model holds if and only if both the RQS and ME models hold.
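Proof: If the S model holds, then the RQS and ME models hold. Conversely, assuming that both the RQS and ME models hold, we shall show that the S model holds. Let $F_i^X = \sum_{k=1}^{i} p_{k\cdot}$ and $F_i^Y = \sum_{k=1}^{i} p_{\cdot k}$ denote the distribution functions of $X$ and $Y$. For $i = 1, \ldots, R-1$,
$$F_i^X - F_i^Y = \sum_{s=1}^{i} \sum_{t=i+1}^{R} (p_{st} - p_{ts}).$$
Then we have
$$E(X) = R - \sum_{i=1}^{R-1} F_i^X \quad \text{and} \quad E(Y) = R - \sum_{i=1}^{R-1} F_i^Y.$$
Thus we see
$$E(Y) - E(X) = \sum_{i=1}^{R-1} \sum_{s=1}^{i} \sum_{t=i+1}^{R} (p_{st} - p_{ts}).$$
If $\theta = 1$ in the RQS model, we see that the S model holds. If $\theta > 1$, then $p_{st} > p_{ts}$ for $s < t$ (because $v_s < v_t$ when the marginal probabilities are positive), and we see $E(X) < E(Y)$; similarly, if $\theta < 1$, we see $E(X) > E(Y)$. Since the ME model holds, i.e., $E(X) = E(Y)$, we obtain $\theta = 1$. Namely, the S model holds. The proof is completed.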
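The argument relies on two elementary facts: the difference of the marginal means equals the sum over i = 1, ..., R − 1 of the differences between the marginal distribution functions of X and Y, and a table with p_ij > p_ji for all i < j has the distribution function of X above that of Y. The sketch below checks both numerically on a hypothetical 3×3 table; the function name `marginal_cdfs_and_means` and the table values are illustrative assumptions, not data from the paper.

```python
import numpy as np

def marginal_cdfs_and_means(p):
    """Return (F_x, F_y, mean_x, mean_y) for a square probability table p."""
    p = np.asarray(p, dtype=float)
    cats = np.arange(1, p.shape[0] + 1)    # category scores 1, ..., R
    row, col = p.sum(axis=1), p.sum(axis=0)
    F_x, F_y = np.cumsum(row), np.cumsum(col)
    return F_x, F_y, float(cats @ row), float(cats @ col)

# Hypothetical table with p_ij > p_ji for every i < j (illustrative values only).
p_upper = np.array([[0.20, 0.10, 0.05],
                    [0.05, 0.25, 0.10],
                    [0.02, 0.03, 0.20]])
F_x, F_y, mean_x, mean_y = marginal_cdfs_and_means(p_upper)

# E(Y) - E(X) versus the summed differences of the distribution functions.
mean_gap = mean_y - mean_x
cdf_gap = float(np.sum(F_x[:-1] - F_y[:-1]))
```

In this table X is stochastically less than Y, so the marginal means cannot be equal, which is the mechanism that forces θ = 1 when the ME model is imposed.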

Examples
Example 1: The data in Table 1, taken from Stuart (1955), are constructed from unaided distance vision of 7477 women aged 30-39 employed in Royal Ordnance factories in Britain from 1943 to 1946. These data have been analyzed by many statisticians, including Stuart (1955), Caussinus (1965), Bishop et al. (1975, p. 284), McCullagh (1978), Goodman (1979), Tomizawa (1993), and Tomizawa and Tahata (2007).

Table 3: Likelihood ratio values $G^2$ for models applied to the data in Tables 1 and 2.

... greater than 1. We see that $\hat{\theta}^{\hat{v}_j - \hat{v}_i}$ for $i < j$ are greater than 1. Therefore, under the RQS model, the probability that the status category for the father in a pair is $i$ and that for his son is $j$ ($> i$) is estimated to be $\hat{\theta}^{\hat{v}_j - \hat{v}_i}$ ($> 1$) times higher than the probability that the status category for the father is $j$ and that for his son is $i$. In addition, since $\hat{\theta} > 1$, the marginal probability that the status category for the father is $i$ or below ($i = 1, \ldots, 4$) is estimated to be greater than the marginal probability that the status category for his son is $i$ or below. That is, the father's status category is more likely than his son's to be $i$ or below ($i = 1, \ldots, 4$).
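For readers who want to reproduce a likelihood ratio value of the kind reported in Table 3, the sketch below computes $G^2$ for the S model only, whose fitted values have the well-known closed form (n_ij + n_ji)/2. It does not fit the RQS model, which requires the iterative procedure referred to in the Appendix. The function name `g2_symmetry` and the counts are hypothetical; they are not the data of Tables 1 or 2.

```python
import numpy as np

def g2_symmetry(counts):
    """Return (G2, df) for the symmetry model on a square table of counts."""
    n = np.asarray(counts, dtype=float)
    m = (n + n.T) / 2.0                    # fitted values under the S model
    mask = n > 0                           # cells with n_ij = 0 contribute zero
    g2 = 2.0 * np.sum(n[mask] * np.log(n[mask] / m[mask]))
    R = n.shape[0]
    return float(g2), R * (R - 1) // 2

# Hypothetical 3x3 counts (illustrative values only).
counts_asym = np.array([[10, 8, 4],
                        [3, 12, 9],
                        [1, 2, 10]])
g2_asym, df_asym = g2_symmetry(counts_asym)

# A perfectly symmetric table is reproduced exactly by the S model.
counts_sym = np.array([[10, 5, 2],
                       [5, 12, 4],
                       [2, 4, 10]])
g2_sym, df_sym = g2_symmetry(counts_sym)
```

The value of $G^2$ would then be compared with a chi-squared distribution on the returned degrees of freedom.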

Concluding Remarks
The RQS model applied to the data is based on the marginal ridits. The RQS model may be appropriate if it is supposed that the categories of the ordinal row and column variables represent intervals of an underlying continuous distribution. Readers may wonder why one should use the RQS model instead of the QS model. The QS model indicates the structure in which the odds ratios are symmetric with respect to the main diagonal of the table. However, under the QS model we cannot infer that X is stochastically less than Y or vice versa. On the other hand, the RQS model implies the QS model, and the parameter θ in the RQS model would be useful for making inferences such as that X is stochastically less than Y or vice versa (see Sections 2 and 4). Also, the QS model is considered for nominal categorical data, whereas the RQS model should be considered for ordinal categorical data, because the RQS model is not invariant under arbitrary permutations applied simultaneously to the row and column categories.
Moreover, the RQS model rather than the LDPS (OQS) model may be appropriate when we cannot assign the integer scores (or known scores $u_1 < \cdots < u_R$) to the categories of both the rows and the columns, which share the same classifications.
Each of the S, QS, LDPS (OQS), MH and ME models is saturated on the main diagonal cells of the table, but the RQS model is unsaturated on them. Thus, under the RQS model, the estimated expected frequencies on the main diagonal are not, in general, equal to the observed frequencies on the main diagonal (see Tables 1 and 2). The RQS model may be useful when we want to utilize the information on the main diagonal.
The decomposition of the S model into the RQS and ME models, given by Theorem 1, would be useful for identifying the reason for the poor fit when the S model fits the data poorly. Indeed, for the data in Table 1, the poor fit of the S model is caused by the poor fit of the ME model rather than the RQS model, that is, by the fact that the mean grade of the right eye differs from the mean grade of the left eye (see Example 1).

Table 1: Unaided distance vision of women; from Stuart (1955). The upper and lower parenthesized values are the maximum likelihood estimates of expected frequencies under the LDPS and RQS models, respectively.

Table 2: Occupational status for father/son pairs; from Agresti (1984, p. 206). The parenthesized values are the maximum likelihood estimates of expected frequencies under the RQS model.

Table 4: Maximum likelihood estimates of ridits $\{r_i^X\}$, $\{r_i^Y\}$ and $\{v_i\}$ under the RQS model applied to the data in Tables 1 and 2.