Changing the Reference Measure in the Simplex and Its Weighting Effects

Under the assumption that the Aitchison geometry holds in the simplex, standard analysis of compositional data assumes a uniform distribution as reference measure of the space. Changing the reference measure induces a weighting of parts. The changes that appear in the algebraic-geometric structure of the simplex are analysed, as a step towards understanding the implications for elementary statistics of random compositions. Some of the standard tools in exploratory analysis of compositional data, such as center, variation matrix and biplots are studied in some detail, although further research is still needed. The main result is that through a progressive down-weighting of some parts, the geometry of the space approaches that of the corresponding subcomposition. In this way, the coherence between standard and down-weighted analyses is preserved.


Introduction
When analysing a composition, some parts may heavily influence the results. A typical example is inaccuracy in the measurement of some parts that are not fully relevant. Such parts can dominate the analysis, producing a large contribution to variability or to distances. Also, the relevance of some parts in a given problem can call for weighting techniques that adapt the simplex geometry accordingly. There are a number of weighting techniques that can be useful in this sense (e.g. Filzmoser and Hron 2015). Among them, the change of reference measure of the simplex has several implications that need to be fully understood for a consistent analysis. This contribution aims to show the changes that appear in the algebraic-geometric structure of the simplex, as well as some effects on elementary statistics and exploratory tools.
One of the most fruitful concepts in compositional analysis is that of subcomposition (Aitchison 1986). In Aitchison (1992), some reasonable principles for a coherent analysis of subcompositions were established. Beyond the idea that compositional analyses should be scale invariant, those principles included the assumption that distances between compositions should be greater than or equal to those observed in a subcomposition. This principle, called subcompositional dominance (Aitchison 1992; Aitchison, Barceló-Vidal, Martín-Fernández, and Pawlowsky-Glahn 2000; Egozcue 2009), highlights a change of the geometry of subcompositions (for instance, a change in the distance between two data points in the subcomposition) with respect to the original geometry of the full composition. Taking a subcomposition can be considered as an extreme case of down-weighting, since the influence of some parts of the composition is removed from the analysis. However, there are cases in which the complete removal of the influence of some parts of the original composition is not desirable. This motivates the idea of weighting compositions as a continuous transition from the full composition, endowed with the corresponding Aitchison geometry (Pawlowsky-Glahn and Egozcue 2001), to a subcomposition, endowed with the induced Aitchison geometry, which differs in dimension and metrics (distances, inner product, norm).
Apparently, there are many ways of weighting compositions so that a transition from a full composition to a subcomposition is performed. However, fulfilling all coherence requirements is quite challenging. One option deserving attention is the one proposed for Bayes spaces (Boogaart, Egozcue, and Pawlowsky-Glahn 2010; Egozcue, Pawlowsky-Glahn, Tolosana-Delgado, Ortego, and Boogaart 2013b) and, more specifically, for Bayes Hilbert spaces (Egozcue, Díaz-Barrero, and Pawlowsky-Glahn 2006; Boogaart, Egozcue, and Pawlowsky-Glahn 2014). Bayes Hilbert spaces are spaces of measures and densities, and their algebraic-geometric structure is an extension of the Aitchison geometry of the simplex. In fact, Boogaart et al. (2014) show that the simplex, endowed with the Aitchison geometry, is a particular case of a Bayes Hilbert space. In the development of Bayes Hilbert spaces, a reference probability measure is introduced as a parameter regulating the geometry of the measures and densities in the space. This kind of approach provides a way of coherently introducing weighting strategies, both in the simplex and in the analysis of compositional data. The present aim is to start studying the change of reference measure in the simplex, being conscious that there is a long way from the general theory of Bayes Hilbert spaces to applications in compositional data analysis. Special attention is paid to the transition from the geometry of the simplex $S^D$ for compositions to the geometry of $S^d$, $d < D$, where subcompositions are defined. The main difficulties are interpretative, as usual in compositional data analysis.
The structure of the paper is as follows: Section 2 translates the milestones of Bayes Hilbert spaces into the case of compositions, with special emphasis on the role of the reference measure. Section 3 introduces the centered log-ratio transformation (clr) with respect to an arbitrary reference measure in the simplex, following the definition in Boogaart et al. (2014) for general Bayes Hilbert spaces. Section 4 goes into the details of metric concepts under a change of the reference measure, such as orthogonality, bases, and balances. A proposition on dominance of distances is stated there (see proof in Appendix A). Section 5 gives an introduction to distributions of random compositions, their variability and centre under a weighted geometry of the simplex. Section 6 shows how the variation matrix and biplots work under weighting, using an example of electoral results.

Change of reference measure for compositions
Consider $D$ categories $c_1, c_2, \dots, c_D$; they represent a partition of a measurable space $\Omega$. A $D$-part composition $x = (x_1, x_2, \dots, x_D)$ in the $D$-part simplex $S^D$ assigns a proportion $x_i$ to the category $c_i$. Assuming that the composition $x$ is closed to 1, the proportion assigned to the whole space $\Omega$ is just 1. For any subset of categories, the proportion assigned is the sum of the corresponding proportions. For instance, the proportion assigned to the subset $\{c_1\}$ is $x_1$, and the proportion assigned to the subset $\{c_1, c_2\}$ is $x_1 + x_2$. From this point of view, the composition $x$ defines a finitely additive measure on $\Omega$, which is denoted $\mu_x\{\cdot\}$. The argument of this measure is any subset of $\Omega$. Examples are
$$\mu_x\{c_1\} = x_1, \qquad \mu_x\{c_1, c_2\} = x_1 + x_2, \qquad \mu_x\{\Omega\} = \sum_{i=1}^{D} x_i = 1 .$$
Measures can be represented by densities. The idea is that sums (integrals) on a subset of $\Omega$ give the measure of this subset. In the case of the simplex $S^D$, the density is identified with the composition $x$, as for any subset $A \subseteq \Omega$ it satisfies
$$\mu_x\{A\} = \sum_{c_i \in A} x_i = \sum_{c_i \in A} x_i \, P_0\{c_i\},$$
where the uniform measure $P_0\{\cdot\}$ on $\Omega$ has been made explicit as reference measure. Note that $P_0\{\Omega\} = D$ and that the addends of sums (integrals) along the composition are equally weighted with $1 = P_0\{c_i\}$. The reference measure specified as $p_0 = (P_0\{c_1\}, P_0\{c_2\}, \dots, P_0\{c_D\})$ is a non-closed uniform measure. Therefore, it is compositionally equivalent to the neutral element of the simplex, $n = (1/D, 1/D, \dots, 1/D)$. The conclusion is that a composition $x \in S^D$ defines a measure $\mu_x$ on $\Omega$ by specifying the measure of each elementary subset $\{c_i\}$ and, at the same time, $x$ is the density of $\mu_x$ with respect to the uniform reference measure $P_0$, whose density is $p_0$. In mathematical terms, the density (composition) $x$ is the Radon-Nikodym derivative of $\mu_x$ with respect to the reference measure $P_0$, which can be written as
$$x = \frac{d\mu_x}{dP_0}, \qquad \mu_x\{A\} = \int_A \frac{d\mu_x}{dP_0}\, dP_0 ,$$
for any $A \subseteq \Omega$. When $P_0$ is the unitary and uniform reference measure, there is no need to distinguish between $x$ as a composition, as a measure or as a density. These facts change when weights are introduced through the reference measure.
To analyse the effects of a change of reference measure as a means to introduce weights, consider an arbitrary array of positive weights, $p = (p_1, p_2, \dots, p_D)$. The corresponding measure $P$ is then characterised by $P\{c_i\} = p_i$, for $i = 1, 2, \dots, D$, and by the measure of the whole space, $P\{\Omega\} = \sum_{i=1}^{D} p_i$. Note that $p$ is the density of $P$ with respect to the uniform measure $P_0$. The question is now to find the density of the measure $\mu_x$ with respect to the new reference measure $P$. This density is
$$y = \frac{x}{p} = \left(\frac{x_1}{p_1}, \frac{x_2}{p_2}, \dots, \frac{x_D}{p_D}\right), \qquad \mu_x\{A\} = \sum_{c_i \in A} y_i \, p_i .$$
The measure $\mu_x$ is thus retrieved from two different densities: $x$ when considering the uniform reference $P_0$, and $y$ for a reference $P$. Note that $y$ is a vector whose components do not add to one, i.e. it is not closed. However, it is compositionally equivalent to $\mathcal{C}y = x \ominus p$, as its components are proportional (Pawlowsky-Glahn, Egozcue, and Tolosana-Delgado 2015).
If the reference measure $P$ is represented by the vector of weights $p$, the composition $\mathcal{C}y$ is just a perturbation of $x$, a shift in the simplex, recalling that the perturbation-difference includes the closure, $\mathcal{C}$, and, consequently, $\mathcal{C}y = x \ominus p = x \ominus \mathcal{C}p$. From now on, the non-closed version of $y$ is denoted $y^{(p)}$ when the reference measure needs to be specified. Following Boogaart et al. (2010) and Boogaart et al. (2014), a weighted perturbation and powering can be defined for densities like $y^{(p)}$ such that they operate linearly in the weighted simplex. However, their use is not recommended in this context, as standard perturbation ($\oplus$) and powering ($\odot$) are easily interpreted and computed in applications. This avoids linear operations with the shifted densities $\mathcal{C}y^{(p)} = x \ominus p$. In practice, weighted compositions will be used only in the computation of distances and inner products, as explained below.
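The relation between the two densities of the same measure can be checked numerically. The following is a minimal sketch in Python with NumPy; the weights `p` and the helper names `closure` and `density` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def closure(v):
    """Rescale a positive vector so that its components add to 1."""
    v = np.asarray(v, dtype=float)
    return v / v.sum()

def density(x, p):
    """Density y = x / p of the measure mu_x with respect to weights p (not closed)."""
    return np.asarray(x, dtype=float) / np.asarray(p, dtype=float)

x = closure([0.1, 0.7, 0.2])     # composition: density w.r.t. p0 = (1, 1, 1)
p = np.array([1.0, 1.0, 0.1])    # illustrative weights, third part down-weighted
y = density(x, p)                # density w.r.t. P; its components do not add to 1

# Both densities return the same measure of each category: y_i * p_i = x_i.
measure_from_y = y * p

# C(y) is the perturbation-difference of x and p; closing p first changes nothing.
shifted = closure(y)
shifted_closed_p = closure(x / closure(p))
```

Here `shifted` and `shifted_closed_p` coincide, which is the numerical counterpart of the statement that the perturbation-difference includes the closure.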

Centred log-ratio with respect to a reference measure
In Boogaart et al. (2014), the clr-transformation of a density $f$ with respect to a given reference measure $P$ is defined as
$$\mathrm{clr}_P(f)(\omega) = \log f(\omega) - \frac{1}{P\{\Omega\}} \int_\Omega \log f(\xi)\, dP(\xi),$$
where $\Omega$ is the measurable set where the density $f$ is defined. In the present case, $\Omega$ is the set of the $D$ parts or categories of $S^D$, namely $c_i$, $i = 1, 2, \dots, D$. Therefore, the values of the argument $\omega$ in such an expression correspond to the $c_i$'s. Since $f$ is a density of a measure with respect to the reference measure $P$, it can be identified with the density $y = x/p$, as introduced in Section 2. With these identifications, the $\mathrm{clr}_p$-transformation of the simplex with respect to the measure $P$, represented by $p = (p_1, p_2, \dots, p_D)$, is
$$\mathrm{clr}_p(y) = \left(\log\frac{y_1}{g_p(y)}, \log\frac{y_2}{g_p(y)}, \dots, \log\frac{y_D}{g_p(y)}\right), \qquad g_p(y) = \exp\left(\frac{1}{s_p}\sum_{i=1}^{D} p_i \log y_i\right), \qquad (3)$$
where $s_p = \sum_{i=1}^{D} p_i$, and $g_p(\cdot)$ denotes a weighted geometric mean of the parts $y_i$. It is worth noting that $p$, the reference measure of the categories $c_i$, is not closed to $D$, and that $P\{\Omega\} = s_p$, while for the uniform reference measure $P_0$, $s_{p_0} = D$. Note also that $y$ can be closed or not, as Equation 3 is scale invariant.
An important characteristic of $\mathrm{clr}_p(y)$ is that the weighted sum of its $D$ components is zero, that is
$$\sum_{i=1}^{D} p_i \log\frac{y_i}{g_p(y)} = 0, \qquad (4)$$
generalising the ordinary clr in $S^D$, for which the sum of its components (weights equal to 1) is zero. This has a geometric interpretation in the space $\mathbb{R}^D$, where a point has coordinates $\log(y) = (\log y_1, \log y_2, \dots, \log y_D)$. As illustrated in Figure 1, which shows a scheme for $D = 2$, to obtain the ordinary clr of a generic point $\log(y)$, the point is orthogonally projected onto a hyperplane through the origin whose orthogonal vector is $(1, 1, \dots, 1)$ (Aitchison 1986; Pawlowsky-Glahn et al. 2015). When using a non-uniform $p = (p_1, p_2, \dots, p_D)$, the procedure to get $\mathrm{clr}_p(y)$ is to orthogonally project the point $\log(y)$ onto a hyperplane whose orthogonal vector is $p$, as shown by the inner product in $\mathbb{R}^D$ implicit in Equation 4. Summarising, $\mathrm{clr}_p$ is a projection of $\log(y)$ on a hyperplane whose normal vector is $p$. A particular case of interest is that of
$$p = (1, 1, \dots, 1, \varepsilon),$$
for which $P\{\Omega\} = (D - 1) + \varepsilon$. When $\varepsilon \to 0$, the weight of the $D$-th part is reduced from 1 towards 0. For small enough $\varepsilon$, the weighted geometric mean $g_p$ in Equation 3 approaches the ordinary geometric mean of the first $D - 1$ parts of $y$. A consequence is that the first $D - 1$ components of $\mathrm{clr}_p(y)$ approach the ordinary clr of the subcomposition formed by $(y_1, y_2, \dots, y_{D-1})$. This suggests that this kind of reference measure may approach the induced Aitchison geometry on the subcomposition.
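The properties of the weighted clr described above (zero weighted sum, scale invariance, and the approach to the clr of the subcomposition as the last weight vanishes) can be illustrated with a short Python sketch. The numerical values of `y` and `p` are arbitrary illustrative choices.

```python
import numpy as np

def clr_p(y, p):
    """Weighted clr: log(y_i / g_p(y)), with g_p the p-weighted geometric mean."""
    logy = np.log(np.asarray(y, dtype=float))
    p = np.asarray(p, dtype=float)
    log_gp = np.sum(p * logy) / p.sum()
    return logy - log_gp

y = np.array([0.2, 0.5, 0.3])
p = np.array([1.0, 1.0, 0.5])

# Weighted sum of the clr_p components: zero up to rounding.
weighted_sum = np.sum(p * clr_p(y, p))

# Scale invariance: multiplying y by a constant does not change clr_p.
scale_gap = np.max(np.abs(clr_p(y, p) - clr_p(10.0 * y, p)))

# Down-weighting the last part with p = (1, 1, eps), eps -> 0: the first two
# components approach the ordinary clr of the subcomposition (y1, y2).
clr_sub = clr_p(y[:2], np.ones(2))
gaps = [np.max(np.abs(clr_p(y, [1.0, 1.0, eps])[:2] - clr_sub))
        for eps in (1.0, 0.1, 0.001)]
```

The list `gaps` decreases as `eps` decreases, in line with the limiting behaviour described in the text.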

Metrics under change of reference
The clr transformation can be used to define the inner product in $S^D$, as was done in Bayes Hilbert spaces (Boogaart et al. 2014, Def. 2). There, the proposed definition was
$$\langle y_1, y_2 \rangle_P = \frac{1}{P\{\Omega\}} \int_\Omega \mathrm{clr}_P(y_1)\, \mathrm{clr}_P(y_2)\, dP .$$
This definition leads to an inner product in $S^D$ which, for a uniform reference measure $P_0$ with weights $p_0 = (1, 1, \dots, 1)$, is
$$\langle x_1, x_2 \rangle = \frac{1}{D}\, \langle \mathrm{clr}(x_1), \mathrm{clr}(x_2) \rangle, \qquad (6)$$
where $\langle \cdot, \cdot \rangle$ on the right-hand side is the ordinary inner product in $\mathbb{R}^D$. This is not the standard inner product in compositional data analysis, due to the factor $1/D$. It is not suitable for compositional data analysis, as it does not fulfil the principle of subcompositional dominance of distances. For instance, consider the 3-part compositions $u = (0.1, 0.7, 0.2)$ and $v = (1/3, 1/3, 1/3)$. Their distance, in the geometry induced by the inner product (6) in $S^3$, is $d_3(u, v) = 0.805$. Taking the subcomposition formed by the first and second parts and computing the distance in $S^2$ according to (6), the result is $d_2 = 0.973 > d_3(u, v)$, so the distance in the subcomposition exceeds the distance in the full composition.
The discussion about the role of the constant $1/D$ in the inner product is related to the fact that in Boogaart et al. (2014) the reference measure was assumed to satisfy $P_0\{\Omega\} = 1$. If $0 < P_0\{\Omega\} < +\infty$, the value $P_0\{\Omega\}$ is irrelevant when one does not try to compare results of an analysis using different reference measures, as was the case in that contribution. On the contrary, in Egozcue et al. (2006) the reference is implicitly assumed to be proportional to the length of the interval supporting the densities of the Hilbert space, that is, $P_0\{\Omega\}$ is adapted for each support $\Omega$. Here this second strategy has been adopted so that analytical results using different references become comparable, fulfilling the subcompositional coherence requirements. This strategy of normalizing the reference measures has a consequence which might be uncomfortable for some readers, namely that $p_0$, or in general $p$, are not only non-closed compositions, but also convey information about the size of $\Omega$, $P\{\Omega\} = \sum_{i=1}^{D} P\{c_i\}$. In the following development, $p$ or $p_0$ appear to be closed when represented as elements of the simplex, but retain their absolute values when the components are used as weights in sums (integrals) along compositions or clr images.
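The numerical example with $u$ and $v$ can be reproduced with a few lines of Python. This is a minimal sketch: the function `dist` and its flag `scaled` are hypothetical helpers introduced here only to switch the factor $1/D$ on and off.

```python
import numpy as np

def clr(x):
    """Ordinary clr; scale invariant, so the input need not be closed."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

def dist(u, v, scaled):
    """Distance from clr differences; scaled=True keeps the factor 1/D of (6)."""
    diff = clr(u) - clr(v)
    sq = np.sum(diff ** 2)
    return float(np.sqrt(sq / len(diff)) if scaled else np.sqrt(sq))

u = np.array([0.1, 0.7, 0.2])
v = np.array([1 / 3, 1 / 3, 1 / 3])

# With the factor 1/D the subcomposition distance exceeds the full one.
d3_scaled = dist(u, v, scaled=True)            # about 0.805
d2_scaled = dist(u[:2], v[:2], scaled=True)    # about 0.973

# Without the factor (standard Aitchison distance) dominance holds.
d3 = dist(u, v, scaled=False)                  # about 1.395
d2 = dist(u[:2], v[:2], scaled=False)          # about 1.376
```

With the factor $1/D$ the two-part distance is the larger one, whereas suppressing the factor restores the ordering required by subcompositional dominance.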
To match the present definition to the standard practice in compositional data analysis (Aitchison 1986; Aitchison and Egozcue 2005; Egozcue, Barceló-Vidal, Martín-Fernández, Jarauta-Bragulat, Díaz-Barrero, and Mateu-Figueras 2011; Pawlowsky-Glahn et al. 2015) and to the subcompositional dominance of distances, the factor $1/D$ in (6) is suppressed. Remember that multiplication by a positive real scalar in an inner product does not change its character.
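In the case of using a reference measure represented by the weights $p$, the appropriate definition of the weighted Aitchison inner product is
$$\langle y_1, y_2 \rangle_p = \sum_{i=1}^{D} p_i \, \mathrm{clr}_{p,i}(y_1)\, \mathrm{clr}_{p,i}(y_2), \qquad (7)$$
where $y_k = x_k / p$, $k = 1, 2$, and $\mathrm{clr}_{p,i}(\cdot)$ denotes the $i$-th component of $\mathrm{clr}_p(\cdot)$. The expression on the right-hand side of Equation (7) is an inner product of the $\mathrm{clr}_p$ as real vectors with respect to the measure $P$.
The weighted Aitchison norm, derived from the inner product, is $\|y\|_p^2 = \langle y, y \rangle_p$, and an explicit expression of the distance is
$$d_p^2(y_1, y_2) = \sum_{i=1}^{D} p_i \left(\mathrm{clr}_{p,i}(y_1) - \mathrm{clr}_{p,i}(y_2)\right)^2 .$$
This expression of weighted distance can be written in matrix notation as
$$d_p^2(y_1, y_2) = \left(\mathrm{clr}_p(y_1) - \mathrm{clr}_p(y_2)\right) \mathrm{diag}(p) \left(\mathrm{clr}_p(y_1) - \mathrm{clr}_p(y_2)\right)^\top ,$$
where the $\mathrm{clr}_p$ are row vectors and $\mathrm{diag}(p)$ is a diagonal $(D, D)$-matrix containing the weights $p$. These definitions coincide with those of the ordinary Aitchison geometry of $S^D$ whenever $p = p_0 = (1, 1, \dots, 1)$. When $p \neq p_0$, the inner product differs from the ordinary Aitchison inner product and, consequently, norm and distance are also different.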
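As an illustration of the matrix form of the weighted distance, the following Python sketch computes the weighted square-distance between two compositions and checks that it reduces to the ordinary Aitchison square-distance for unit weights. The compositions and the weights `p` are illustrative values; `dist2_p` is a hypothetical helper name.

```python
import numpy as np

def clr_p(y, p):
    """Weighted clr with respect to the weights p."""
    logy = np.log(np.asarray(y, dtype=float))
    return logy - np.sum(p * logy) / np.sum(p)

def dist2_p(x1, x2, p):
    """Weighted square-distance of the densities x1/p and x2/p, in matrix form."""
    p = np.asarray(p, dtype=float)
    diff = clr_p(x1 / p, p) - clr_p(x2 / p, p)   # difference of clr_p row vectors
    return float(diff @ np.diag(p) @ diff)

x1 = np.array([0.1, 0.7, 0.2])
x2 = np.array([0.5, 0.3, 0.2])
p0 = np.ones(3)
p = np.array([1.0, 1.0, 0.1])

# Unit weights: the ordinary Aitchison square-distance is recovered.
logratio = np.log(x1 / x2)
aitchison2 = float(np.sum((logratio - logratio.mean()) ** 2))
same_as_standard = np.isclose(dist2_p(x1, x2, p0), aitchison2)

# The shift by p cancels in the difference, so x's can be used instead of y's.
diff_x = clr_p(x1, p) - clr_p(x2, p)
same_without_shift = np.isclose(dist2_p(x1, x2, p), float(np.sum(p * diff_x ** 2)))

# Down-weighting a part does not increase the distance in this example.
weighted_is_smaller = dist2_p(x1, x2, p) <= dist2_p(x1, x2, p0)
```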
To gain further intuition about what changes with $p$, it is instructive to build orthonormal bases of the simplex according to the change of reference. This shows how these bases appear under a change of $p$ in particular cases.
A straightforward technique for obtaining orthonormal bases of the simplex and their respective coordinates (Egozcue, Pawlowsky-Glahn, Mateu-Figueras, and Barceló-Vidal 2003) is that of sequential binary partitions (SBP) (Egozcue and Pawlowsky-Glahn 2005, 2006). As in the standard case (reference measure $P_0$), when using a reference measure with the weights $p$, the procedure is based on a partition coded as in Table 1, but the formulae to obtain the contrast matrix are modified. Table 1 shows a generic sign code for an SBP, adding weights $p$ as column labels (second row) for further comment on the generalised technique.
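Denote the entries of the matrix code as $\theta_{ij}$, $i = 1, 2, \dots, D - 1$, $j = 1, 2, \dots, D$. For the case in Table 1, $D = 5$ and, for instance, $\theta_{32} = +1$. When using the standard reference measure $p_0 = (1, 1, \dots, 1)$, the clr coefficients of an element of the basis, that is of a balancing element, are given by
$$\psi_{ij} = \begin{cases} +\dfrac{1}{n_i^+}\sqrt{\dfrac{n_i^+ n_i^-}{n_i^+ + n_i^-}}, & \theta_{ij} = +1, \\[2ex] -\dfrac{1}{n_i^-}\sqrt{\dfrac{n_i^+ n_i^-}{n_i^+ + n_i^-}}, & \theta_{ij} = -1, \\[2ex] 0, & \theta_{ij} = 0, \end{cases}$$
where $n_i^+$ denotes the number of $+1$, respectively $n_i^-$ of $-1$, in the $i$-th row of the code table. When using a reference measure whose weights $p_j$ are not unity, these formulas for the $\mathrm{clr}_p$ of balancing elements are the same except that $n_i^+$, $n_i^-$ are
$$n_i^+ = \sum_{j:\, \theta_{ij} = +1} p_j, \qquad n_i^- = \sum_{j:\, \theta_{ij} = -1} p_j .$$
The contrast matrix $\Psi$, with entries $\psi_{ij}$, satisfies
$$\Psi\, \mathrm{diag}(p)\, \Psi^\top = I_{D-1}, \qquad \Psi^\top \Psi\, \mathrm{diag}(p) = I_D - \frac{1}{s_p}\, \mathbf{1}^\top p ,$$
where $I_m$ is the $(m, m)$-identity matrix; $p$ and $\mathbf{1} = (1, 1, \dots, 1)$ are taken as row $D$-vectors, and $\mathrm{diag}(p)$ is a $(D, D)$ diagonal matrix with entries equal to the components of $p$. The first condition is equivalent to saying that balancing elements are unitary, mutually orthogonal compositions. In fact, their $\mathrm{clr}_p$ are unitary and orthogonal in the weighted Euclidean geometry.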
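The construction of the contrast matrix from a sign code and a vector of weights can be sketched in Python as follows. Since Table 1 is not reproduced here, the sign code `theta` (for $D = 4$) and the weights `p` are hypothetical; the two matrix conditions are checked in the form that follows from the weighted orthonormality of the rows and from the zero weighted sum of each row.

```python
import numpy as np

def contrast_matrix(theta, p):
    """Rows are the clr_p coefficients of the balancing elements of an SBP.

    theta: (D-1, D) sign code with entries +1, -1, 0.
    p:     positive weights of the D parts.
    """
    theta = np.asarray(theta, dtype=float)
    p = np.asarray(p, dtype=float)
    n_plus = (theta > 0).astype(float) @ p     # sum of weights coded +1, per row
    n_minus = (theta < 0).astype(float) @ p    # sum of weights coded -1, per row
    c = np.sqrt(n_plus * n_minus / (n_plus + n_minus))
    return (np.where(theta > 0, (c / n_plus)[:, None], 0.0)
            - np.where(theta < 0, (c / n_minus)[:, None], 0.0))

# Hypothetical sign code for D = 4 and illustrative weights.
theta = [[+1, +1, -1, -1],
         [+1, -1, 0, 0],
         [0, 0, +1, -1]]
p = np.array([1.0, 1.0, 0.5, 0.1])
psi = contrast_matrix(theta, p)

D = len(p)
gram = psi @ np.diag(p) @ psi.T                # expected: identity of order D-1
second = psi.T @ psi @ np.diag(p)              # expected: I_D - (1/s_p) 1^T p
expected_second = np.eye(D) - np.outer(np.ones(D), p) / p.sum()
row_weighted_sums = psi @ p                    # expected: zeros
```

With unit weights, `n_plus` and `n_minus` are the counts of $+1$ and $-1$ in each row, and the standard contrast matrix is obtained.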
Coordinates of a density $y \in S^D$ with respect to an orthonormal basis are found by carrying out the inner product of each balancing element in the basis with the density $y = x/p$. In general, these coordinates are termed weighted isometric log-ratio coordinates and denoted by $\mathrm{ilr}_p$. In the particular case in which they are obtained using an SBP, they are called weighted balances. For simplicity, these weighted balances are denoted $b_i$, $i = 1, 2, \dots, D - 1$, with no reference to the weights associated with the change of measure (as shown in Table 1). The $\mathrm{ilr}_p$ coordinates can be obtained using the matrix expression
$$\mathrm{ilr}_p(y) = \mathrm{clr}_p(y)\, \mathrm{diag}(p)\, \Psi^\top = \log(y)\, \mathrm{diag}(p)\, \Psi^\top . \qquad (10)$$
Although Equation 10 is useful from a computational point of view, an explicit expression of balances gives a deeper insight into the meaning of weighted balances. Consider a sign code of a step in an SBP, for which $n_i^+$, $n_i^-$ are given. The corresponding weighted balance is
$$b_i = \sqrt{\frac{n_i^+ n_i^-}{n_i^+ + n_i^-}}\; \log \frac{\left(\prod_{j:\, \theta_{ij} = +1} y_j^{p_j}\right)^{1/n_i^+}}{\left(\prod_{j:\, \theta_{ij} = -1} y_j^{p_j}\right)^{1/n_i^-}} ,$$
where the products span over the parts corresponding to the sign code $\theta_{ij}$. When the weights are $p_j = 1$, the balance reduces to the standard balance, as $n_i^+$, $n_i^-$ are then the number of $+1$ and $-1$ in the $i$-th row of the sign code, respectively. The main feature, when the reference is not $p_0$, is that the ratios within the logarithm are ratios of a kind of weighted geometric mean. Note that, in general, $n_i^+$, $n_i^-$ are not integers and each part is raised to the power of the weight corresponding to that part. When some $p_j$ is small relative to other weights, it plays a minor role in these weighted geometric means. Furthermore, the weighted balances are scale invariant log-contrasts, that is, if the composition $y$ is multiplied by a positive constant, the weighted balance remains unaltered.
Expressing inner products, norms, and distances as functions of weighted coordinates $\mathrm{ilr}_p$ can be useful, because they are exactly those of the standard Euclidean geometry. For the inner product and square-distance they are
$$\langle y_1, y_2 \rangle_p = \langle \mathrm{ilr}_p(y_1), \mathrm{ilr}_p(y_2) \rangle, \qquad d_p^2(y_1, y_2) = d^2\left(\mathrm{ilr}_p(y_1), \mathrm{ilr}_p(y_2)\right), \qquad (12)$$
where $\langle \cdot, \cdot \rangle$ and $d(\cdot, \cdot)$ are the ordinary Euclidean inner product and distance.
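The computation of weighted ilr coordinates, the explicit form of a weighted balance and the isometry with the Euclidean geometry can be checked with the following Python sketch. The sign code `theta` and the weights `p` are hypothetical choices for $D = 3$, and the helper names are assumptions made for this illustration.

```python
import numpy as np

def contrast_matrix(theta, p):
    """clr_p coefficients of the balancing elements for sign code theta."""
    theta = np.asarray(theta, dtype=float)
    n_plus = (theta > 0).astype(float) @ p
    n_minus = (theta < 0).astype(float) @ p
    c = np.sqrt(n_plus * n_minus / (n_plus + n_minus))
    return (np.where(theta > 0, (c / n_plus)[:, None], 0.0)
            - np.where(theta < 0, (c / n_minus)[:, None], 0.0))

def clr_p(y, p):
    logy = np.log(y)
    return logy - np.sum(p * logy) / np.sum(p)

def ilr_p(x, p, psi):
    """Weighted ilr coordinates of the density y = x / p (matrix expression)."""
    y = np.asarray(x, dtype=float) / p
    return (np.log(y) * p) @ psi.T             # log(y) diag(p) Psi^T

theta = [[+1, +1, -1],                         # hypothetical sign code, D = 3
         [+1, -1, 0]]
p = np.array([1.0, 1.0, 0.1])
psi = contrast_matrix(theta, p)

x1 = np.array([0.1, 0.7, 0.2])
x2 = np.array([0.5, 0.3, 0.2])
z1, z2 = ilr_p(x1, p, psi), ilr_p(x2, p, psi)

# Isometry: Euclidean square-distance of coordinates = weighted square-distance.
d2_coords = float(np.sum((z1 - z2) ** 2))
d2_clr = float(np.sum(p * (clr_p(x1 / p, p) - clr_p(x2 / p, p)) ** 2))

# Explicit weighted balance for the first row of the sign code.
y = x1 / p
n_plus, n_minus = p[0] + p[1], p[2]
num = (y[0] ** p[0] * y[1] ** p[1]) ** (1.0 / n_plus)
den = (y[2] ** p[2]) ** (1.0 / n_minus)
b1 = np.sqrt(n_plus * n_minus / (n_plus + n_minus)) * np.log(num / den)
```

The value `b1` agrees with the first coordinate `z1[0]`, and `d2_coords` agrees with `d2_clr`.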
Whenever there is a change in the geometry of compositions, the subcompositional dominance of distances is a critical point. In the standard approach, the distance between any two compositions $x_1, x_2 \in S^D$ is $d_a(x_1, x_2)$. After taking a given subcomposition in $S^d$, $d < D$, the distance between the respective subcompositions, $x_1^{(s)}, x_2^{(s)}$, satisfies $d_a(x_1^{(s)}, x_2^{(s)}) \le d_a(x_1, x_2)$. In this case, both spaces have integer reference measures with $P\{\Omega_D\} = D$ and $P\{\Omega_d\} = d$ and, for $D = 3$, $d = 2$, the corresponding weights are $(1, 1, 1)$ and $(1, 1, 0)$, respectively. When changing the reference measure by down-weighting some of the parts, a dominance of distances is expected, as occurs when taking subcompositions. This dominance of distances is stated as a proposition (see proof in Appendix A).
Figure 2: Evolution of weighted square-distances between three measures represented by the compositions $x_1 = (0.1, 0.7, 0.2)$, $x_2 = (0.5, 0.3, 0.2)$, $x_3 = (0.9, 0.08, 0.02) \in S^3$ with respect to the reference measure $P_0$ with weights $p_0 = (1, 1, 1)$. The weights are $p = (1, 1, \varepsilon)$ and the x-axis is scaled as $P\{\Omega\} = 1 + 1 + \varepsilon$. The three square-distances monotonically increase from $P\{\Omega\} = 2$ to $P\{\Omega\} = 3$. The end points of the curves at $P\{\Omega\} = 2$ and $P\{\Omega\} = 3$ are equal to the standard Aitchison square-distances in $S^2$ and $S^3$, respectively.
It is worth remarking that the notation of distances like $d_{p_1}(y_1, y_2)$ could be changed to $d_{p_1}(x_1, x_2)$, as distances assigned to shifted $y$'s are equal to those of the original compositions $x$'s. This is due to the fact that $x$ and $y$ are densities of the same measure, namely $\mu_x$, with respect to different reference measures.
An experiment has been conducted to show how the changes of reference modify distances and shapes. Five different reference measures $p = (1, 1, \varepsilon)$ have been considered, with $\varepsilon$ equal to 1, 0.5, 0.1, 0.05, 0.01, so that they progressively approach the geometry of the subcomposition of the first two parts. The unit circle centered at the neutral element was shifted by the five reference measures. Figure 3 shows this unit circle (black) and the sequence of perturbations as a consequence of the change of origin. Note that the transformed circle is shifted towards the vertex whose weight is reduced, as expected after dividing each part by the corresponding weight.
After the change of origin, each point on the circles was ilr-transformed using the corresponding weights and an SBP sign code selected to avoid a balance representing the subcomposition $(y_1, y_2)$. Figure 4 (left panel) shows the coordinates of the circles, to show the changes of the distances between points on the same circle. Note that the centers of the ellipses do not coincide, as they correspond to the closure of the reference measure $(1, 1, \varepsilon)$. The main feature is the progressive stretch of the original circle. For very small $\varepsilon$, the ellipse tends to degenerate into a segment following the direction of the subcomposition $(y_1, y_2)$. Similarly, Figure 4 (right panel) shows the deformation of a grid originally at $-1, 0, 1$ in both axes (black). The new references are $\varepsilon = 0.1$ (blue) and $0.01$ (red). The grid is progressively tilted and distances between nodes decrease as $\varepsilon$ decreases. Although straight lines are preserved, their angles change, thus showing the change of geometry when changing the reference.
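The behaviour described in the caption of Figure 2 can be reproduced numerically. The sketch below, in Python, uses the three compositions of the caption and weights $p = (1, 1, \varepsilon)$; the grid of $\varepsilon$ values and the helper `dist2_p` are assumptions made for illustration, and the smallest $\varepsilon$ is taken slightly above zero to keep the weights positive.

```python
import numpy as np

def dist2_p(x1, x2, p):
    """Weighted square-distance; the shift by p cancels, so log(x1/x2) suffices."""
    p = np.asarray(p, dtype=float)
    a = np.log(np.asarray(x1, dtype=float) / np.asarray(x2, dtype=float))
    a_centered = a - np.sum(p * a) / p.sum()   # difference of the two clr_p vectors
    return float(np.sum(p * a_centered ** 2))

x = [np.array([0.1, 0.7, 0.2]),
     np.array([0.5, 0.3, 0.2]),
     np.array([0.9, 0.08, 0.02])]
pairs = [(0, 1), (0, 2), (1, 2)]
eps_grid = np.linspace(1e-6, 1.0, 201)         # P{Omega} = 2 + eps

curves, monotone, ends_ok = {}, {}, {}
for i, j in pairs:
    curve = np.array([dist2_p(x[i], x[j], [1.0, 1.0, e]) for e in eps_grid])
    curves[(i, j)] = curve
    monotone[(i, j)] = bool(np.all(np.diff(curve) >= -1e-12))
    sub = dist2_p(x[i][:2], x[j][:2], np.ones(2))   # Aitchison in S^2
    full = dist2_p(x[i], x[j], np.ones(3))          # Aitchison in S^3
    ends_ok[(i, j)] = bool(np.isclose(curve[0], sub, atol=1e-4)
                           and np.isclose(curve[-1], full))
```

Each curve is non-decreasing in $\varepsilon$ and its end points match the standard Aitchison square-distances in $S^2$ and $S^3$.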

Elementary statistics
The change of reference measure and its associated weighting have consequences for the definitions of elementary concepts of compositional statistics. Variability and center are the two main concepts examined below. Both concepts are redefined following previous developments in the statistical analysis of compositional data, looking only at the influence of the weighting. These new definitions are intended to match the standard concepts whenever the weights are unity over the categories defining the composition.
Figure 4: Left panel: circles of Figure 3 after weighted ilr-transformation. Reference measures are $(1, 1, \varepsilon)$, $\varepsilon = 1$ (black), $\varepsilon = 0.5$ (brown), 0.1 (blue), 0.05 (green), and 0.01 (red). Right panel: a regular grid at points $-1, 0, 1$ in both axes after change of origin and weighted ilr transformation. Weights are $(1, 1, \varepsilon)$, $\varepsilon = 1$ (black), 0.1 (blue), and 0.01 (red).
Let $X$ be a random composition (density) in $S^D$ (Pawlowsky-Glahn et al. 2015, ch. 6) which, for some selected ilr coordinates denoted $X^*$ in $\mathbb{R}^{D-1}$, is absolutely continuous with joint probability density (pdf) $f^*_X$. Therefore, $f^*_X(x)$ is a function defined on $\mathbb{R}^{D-1}$, the space of the ilr coordinates, with the standard definitions from probability theory. Assume also that a new reference measure is chosen and that it is represented by a set of positive weights $p$. Accordingly, the random composition $Y = X \ominus p$ corresponds to the change of reference, and its distribution only differs from that of $X$ in a shift of the center. The $\mathrm{ilr}_p$ coordinates of $Y$, denoted $Y^*$, are also random, but their distribution on $\mathbb{R}^{D-1}$ is a transformation of the previous pdf $f^*_X$, here denoted as $f^*$, where the subscript is dropped when it corresponds to the composition $Y$. In Appendix B it is shown that the transformation from $X^* = \mathrm{ilr}(X)$ to $Y^* = \mathrm{ilr}_p(Y)$ is a linear (affine) transformation. For instance, this means that, if $X$ has a normal distribution on the simplex (Mateu-Figueras, Pawlowsky-Glahn, and Egozcue 2013; Pawlowsky-Glahn et al. 2015) and, thus, $X^*$ is multivariate normal on $\mathbb{R}^{D-1}$, the distribution of $Y^*$ is also multivariate normal on $\mathbb{R}^{D-1}$. As a conclusion, the normality of $\mathrm{ilr}_p$ coordinates is maintained when the weights $p$ of the reference measure change.
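Following the general formalism developed by Fréchet (1948) for metric spaces, the first milestone to be defined is the (total) variability of $Y$ with respect to an arbitrary point $\eta \in S^D$. It is defined as
$$\mathrm{Var}_p[Y; \eta] = \mathrm{E}\left[d_p^2(Y, \eta)\right] = \mathrm{E}\left[d^2(Y^*, \eta^*)\right],$$
where $\eta^* = \mathrm{ilr}_p(\eta)$. By Equation (12), the minimum of $\mathrm{Var}_p[Y; \eta]$ is attained for $\eta^* = \mathrm{E}[Y^*]$, a standard result in real multivariate statistics. Based on this result, the weighted center and total variance are
$$\mathrm{Cen}_p[Y] = \mathrm{ilr}_p^{-1}\left(\mathrm{E}[Y^*]\right), \qquad \mathrm{totVar}_p[Y] = \mathrm{E}\left[d_p^2\left(Y, \mathrm{Cen}_p[Y]\right)\right]. \qquad (13)$$
Note that this kind of approach has been used in Pawlowsky-Glahn and Egozcue (2001) and in Boogaart and Tolosana-Delgado (2013), but total variance is there called metric variance.

Despite the previous expression of $\mathrm{Cen}_p[Y]$ in Equation 13, the weighted center of a random composition only depends on the weights in $p$ through the shift applied, that is
$$\mathrm{Cen}_p[Y] = \mathrm{Cen}[Y] = \mathrm{Cen}[X] \ominus p ,$$
where $\mathrm{Cen}$ is the ordinary center of random compositions, $\ominus$ is the inverse of the ordinary perturbation $\oplus$, and $Y = X \ominus p$, thus reflecting the linearity of expectations.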
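The statement that the weighted center depends on the weights only through the shift can be checked on simulated data. The following Python sketch is illustrative only: the Dirichlet sample, the weights `p` and the sign code `theta` are arbitrary assumptions, and sample means replace expectations.

```python
import numpy as np

def closure(v):
    v = np.asarray(v, dtype=float)
    return v / v.sum()

def contrast_matrix(theta, p):
    """clr_p coefficients of the balancing elements for sign code theta."""
    theta = np.asarray(theta, dtype=float)
    n_plus = (theta > 0).astype(float) @ p
    n_minus = (theta < 0).astype(float) @ p
    c = np.sqrt(n_plus * n_minus / (n_plus + n_minus))
    return (np.where(theta > 0, (c / n_plus)[:, None], 0.0)
            - np.where(theta < 0, (c / n_minus)[:, None], 0.0))

rng = np.random.default_rng(0)
X = rng.dirichlet([2.0, 3.0, 4.0, 5.0], size=50)   # simulated compositions (rows)
p = np.array([1.0, 1.0, 0.5, 0.1])                 # illustrative weights
theta = [[+1, +1, -1, -1], [+1, -1, 0, 0], [0, 0, +1, -1]]  # hypothetical SBP
psi = contrast_matrix(theta, p)

Y = X / p                                          # densities w.r.t. P (not closed)
ilr_Y = (np.log(Y) * p) @ psi.T                    # weighted ilr coordinates, (n, D-1)

# Weighted center: back-transform the mean of the ilr_p coordinates.
cen_p = closure(np.exp(ilr_Y.mean(axis=0) @ psi))

# Ordinary center of X (closed geometric mean), then shifted by p.
cen_x = closure(np.exp(np.log(X).mean(axis=0)))
cen_x_shifted = closure(cen_x / p)
```

The two compositions `cen_p` and `cen_x_shifted` coincide up to rounding, whatever sign code is used to build the basis.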
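Decompositions of the total variance underlie many standard statistical methods, which highlights their utmost importance. Equation 13 leads to decompositions of the total variance when the reference measure $p$ is not $p_0$. Similarly to those described in Egozcue et al. (2011), we obtain
$$\mathrm{totVar}_p[Y] = \sum_{i=1}^{D-1} \mathrm{Var}\left[\mathrm{ilr}_{p,i}(Y)\right] = \sum_{i=1}^{D} p_i \, \mathrm{Var}\left[\mathrm{clr}_{p,i}(Y)\right] = \frac{1}{2 s_p} \sum_{i=1}^{D} \sum_{j=1}^{D} p_i p_j \, \mathrm{Var}\left[\ln\frac{Y_i}{Y_j}\right], \qquad (14)$$
where $s_p = \sum_{i=1}^{D} p_i$, $\mathrm{ilr}_{p,i}(Y) = y^*_i$ and $\mathrm{clr}_{p,i}(Y)$ is the $i$-th component of $\mathrm{clr}_p(Y)$. Note that the decomposition of $\mathrm{totVar}_p[Y]$ into $\mathrm{ilr}_p$ variance components points out that $\mathrm{totVar}_p[Y]$ is the trace of the covariance matrix of $\mathrm{ilr}_p(Y)$, and that $\mathrm{totVar}_p[Y]$ is not the sum of $\mathrm{clr}_p$ variances, but a weighted sum of them.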
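The three decompositions of the weighted total variance can be compared on simulated data. In the Python sketch below, the sample, the weights and the sign code are illustrative assumptions, and sample variances with divisor $n - 1$ replace the theoretical ones.

```python
import numpy as np

def contrast_matrix(theta, p):
    """clr_p coefficients of the balancing elements for sign code theta."""
    theta = np.asarray(theta, dtype=float)
    n_plus = (theta > 0).astype(float) @ p
    n_minus = (theta < 0).astype(float) @ p
    c = np.sqrt(n_plus * n_minus / (n_plus + n_minus))
    return (np.where(theta > 0, (c / n_plus)[:, None], 0.0)
            - np.where(theta < 0, (c / n_minus)[:, None], 0.0))

rng = np.random.default_rng(1)
X = rng.dirichlet([2.0, 3.0, 4.0, 5.0], size=200)  # simulated compositions (rows)
p = np.array([1.0, 1.0, 0.5, 0.1])                 # illustrative weights
theta = [[+1, +1, -1, -1], [+1, -1, 0, 0], [0, 0, +1, -1]]  # hypothetical SBP
psi = contrast_matrix(theta, p)
s_p = p.sum()

logY = np.log(X / p)
clr = logY - (logY @ p / s_p)[:, None]             # clr_p of each row
ilr = (logY * p) @ psi.T                           # ilr_p of each row

# (a) trace of the covariance matrix of the ilr_p coordinates
tot_ilr = float(np.trace(np.cov(ilr, rowvar=False)))
# (b) weighted sum of the clr_p variances
tot_clr = float(np.sum(p * np.var(clr, axis=0, ddof=1)))
# (c) weighted sum of the variances of all simple log-ratios
logratio_var = np.var(logY[:, :, None] - logY[:, None, :], axis=0, ddof=1)
tot_lr = float(np.sum(np.outer(p, p) * logratio_var) / (2.0 * s_p))

# Unweighted sum of clr_p variances, which in general differs from the above.
plain_clr_sum = float(np.sum(np.var(clr, axis=0, ddof=1)))
```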
The decompositions of the total variance are closely related to the relationships between the covariance matrices of the $\mathrm{ilr}_p$ coordinates and the $\mathrm{clr}_p$ coefficients. These relationships can be summarized as
$$\Sigma_p = \Psi\, \mathrm{diag}(p)\, \Sigma^c_p\, \mathrm{diag}(p)\, \Psi^\top, \qquad \Sigma^c_p = \Psi^\top \Sigma_p \Psi ,$$
where $\Psi$ is the $(D - 1, D)$-contrast matrix of the $\mathrm{ilr}_p$, $\Sigma_p$ is the covariance matrix of $Y^*$ and $\Sigma^c_p$ is the covariance matrix of $\mathrm{clr}_p(Y)$. The variation matrix (Aitchison 1986) also plays an important role in the statistics of compositional data. Its entries are variances of simple log-ratios, $\ln(X_i/X_j)$. It has at least two important uses: (a) it constitutes a simple and interpretable representation of the variability (second order moments) of the random composition, identifying the binary sources of variability relative to the total variance; and (b) each entry of the variation matrix is a measure of the compositional dissociation, as opposed to association, between the two parts involved. Point (a) is reflected in the fact that the covariance matrices of ilr coordinates and clr coefficients can be retrieved from the variation matrix (Pawlowsky-Glahn et al. 2015, Appendix A). Concerning point (b), entries that are large relative to other entries point out the most dissociated pairs of parts. The measurement of compositional association of two parts, understood as proportionality between them, is motivated by the fact that $\mathrm{Var}[\ln(X_i/X_j)] = 0$ implies that $X_i$ and $X_j$ are strictly proportional (Egozcue, Lovell, and Pawlowsky-Glahn 2013a; Lovell, Pawlowsky-Glahn, Egozcue, Marguerat, and Bähler 2015).
Inspired by the third decomposition of weighted total variance in Equation 14, a weighted variation matrix can be defined as a $(D, D)$-matrix $T_p$ with entries
$$t_{ij} = p_i p_j \, \mathrm{Var}\left[\ln\frac{Y_i}{Y_j}\right], \qquad i, j = 1, 2, \dots, D .$$
The relationship of $T_p$ with the covariance matrix of $\mathrm{ilr}_p$ coordinates is
$$\Sigma_p = -\frac{1}{2}\, \Psi\, T_p\, \Psi^\top .$$
The decomposition of weighted total variance and the relationships between covariance matrices reduce to the standard ones whenever the reference measure is $P = P_0$, that is, whenever $p = (1, 1, \dots, 1)$.
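The weighted variation matrix and its link with the covariance matrices of $\mathrm{ilr}_p$ coordinates and $\mathrm{clr}_p$ coefficients can be sketched in Python as follows. The simulated sample, the weights `p` and the sign code `theta` are assumptions made for illustration; the checks use sample covariances, for which the same linear relationships hold.

```python
import numpy as np

def contrast_matrix(theta, p):
    """clr_p coefficients of the balancing elements for sign code theta."""
    theta = np.asarray(theta, dtype=float)
    n_plus = (theta > 0).astype(float) @ p
    n_minus = (theta < 0).astype(float) @ p
    c = np.sqrt(n_plus * n_minus / (n_plus + n_minus))
    return (np.where(theta > 0, (c / n_plus)[:, None], 0.0)
            - np.where(theta < 0, (c / n_minus)[:, None], 0.0))

rng = np.random.default_rng(2)
X = rng.dirichlet([2.0, 3.0, 4.0, 5.0], size=200)  # simulated compositions (rows)
p = np.array([1.0, 1.0, 0.5, 0.1])                 # illustrative weights
theta = [[+1, +1, -1, -1], [+1, -1, 0, 0], [0, 0, +1, -1]]  # hypothetical SBP
psi = contrast_matrix(theta, p)

logY = np.log(X / p)
clr = logY - (logY @ p / p.sum())[:, None]
ilr = (logY * p) @ psi.T

sigma_ilr = np.cov(ilr, rowvar=False)              # covariance of ilr_p, (D-1, D-1)
sigma_clr = np.cov(clr, rowvar=False)              # covariance of clr_p, (D, D)

# Ordinary variation matrix (variances of simple log-ratios) and weighted version.
logX = np.log(X)
T = np.var(logX[:, :, None] - logX[:, None, :], axis=0, ddof=1)
T_p = np.outer(p, p) * T                           # entries p_i p_j Var[ln(Y_i/Y_j)]

from_T = -0.5 * psi @ T_p @ psi.T                  # should equal sigma_ilr
from_clr = psi @ np.diag(p) @ sigma_clr @ np.diag(p) @ psi.T
back_to_clr = psi.T @ sigma_ilr @ psi              # should equal sigma_clr
```

Dividing `T_p` entrywise by `np.outer(p, p)` returns the ordinary variation matrix `T`, since the log-ratios of $Y$ and $X$ differ only by constants.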

Exploratory tools
In compositional data analysis, the main specific exploratory tools are the variation matrix (Aitchison 1986), principal component analysis of the clr transformed compositional sample (Aitchison 1983) and its corresponding biplots (Aitchison and Greenacre 2002), and the compositional dendrogram (Pawlowsky-Glahn and Egozcue 2011).These three tools take slightly different forms when taking a reference measure different from P 0 .In order to show how to use and interpret the weighted versions in an exploratory analysis, the data from the Catalan parliament (Spain) elections in November 2010 (Cat10) have been selected.This data set was previously analysed in Egozcue and Pawlowsky-Glahn (2011) (see also Pawlowsky-Glahn et al. 2015).
The data set Cat10 contains the number of votes obtained by several parties, including abstention (abs), null (null) and none of the above or blank votes (nota) in n = 41 electoral districts.The major parties contesting the elections were Convergència i Unió (CiU), Partit dels Socialistes de Catalunya (PSC), Ciutadans-Partido de la Ciudadanía (C's), Esquerra Republicana de Catalunya (ERC), Iniciativa per Catalunya Verds-Esquerra Unida i Alternativa (ICV) and Partit Popular (PP).Other minor parties are amalgamated in other.The present analysis focusses on the whole composition of votes, that is, the D = 10 parts of the composition: abs, nota, null, CiU, C's, ERC, ICV, PP, PSC, other.
A first step in exploratory analysis is to choose suitable weights for the 10 parts involved.The situation in most political elections is that votes to parties show a homogeneous preference to a given party, meanwhile "abs", "nota", "null" and "other" mix non-homogeneous support to democratic elections or other situations, thus suggesting to weight them differently.Well defined parties were weighted by 1.The abstention is the more heterogeneous group of electors and the choice for its weight was 0.1.The electors that choose blank vote (nota) and null vote (null) can be considered less heterogeneous than abstention, as they express something similar to "I want to vote, but none of the contesting parties convinced me"; these two categories have been weighted by 0.3.Votes to parties included in "other" are well defined, but directed to different parties with different programmes; there is a well defined intention in the vote, but the amalgamation of different parties makes the group heterogeneous; the category "other" is weighted by 0.5.The vector of weights p chosen is shown in the second row of Table 2.These weights have been chosen to show the effects of weighting, and not to carry out a sound analysis of the data set.Methods to establish suitable weights should be object of further research.The third row of Table 2 shows an alternative set of weights p (sub) i that will be used only for illustrating how these weights make the analysis to be close to that of a subcomposition of the well defined parties.The forth row of Table 2 shows the center of the composition, expressed in percent.The fifth row is the center Cen p [Y] (also in percent), which is not useful for interpretation, but for comparison with Cen[X].Note how the percent of "abs", with weight 0.1, increased when dividing by the weight.The same fact may occur for all parts with weights less than one, but closure hides this fact.Note that the center is a composition of a "mean electoral district", and that 
variability around this center may be large.This can be checked, for instance, on C's, which minimum percentage is 0.3% and its maximum is 2.8% across the sample of electoral districts, what in turns may represent a number of electors from 3046 up to 1,572,425 for the surroundings of Barcelona.Therefore, reporting mean values or centers needs to be complemented with the analysis of variability.
The weighted variation matrix is shown in the upper triangle of Table 3. In the lower triangle of Table 3, the cross products of weights $p_i p_j$ are specified. Dividing an entry of the weighted variation matrix by the corresponding $p_i p_j$ gives the corresponding entry of the traditional variation matrix with reference $P_0$. Terms in the weighted variation matrix larger than or equal to 0.30 are highlighted in boldface. They constitute the largest sources of variability in the data set. Most of them correspond to C's, whose votes are irregularly distributed over electoral districts. This fact is confirmed by the weighted $\mathrm{clr}_p$ variances, as the largest value corresponds to C's as well. Small values in the weighted variation matrix suggest association between parts, i.e. approximate proportionality, although this needs further analysis to be confirmed (Egozcue et al. 2013a; Lovell et al. 2015). The strongest associations appear between abs, nota, null and the traditionally nationalist parties in Catalonia, i.e. CiU and ERC, and even PSC. Compared to the variation matrix published in Egozcue and Pawlowsky-Glahn (2011), the possible associations appear stronger in Table 3. This is due to the fact that the 2011 analysis was performed without any weighting in the reference. Differences in the variances of simple log-ratios of parts that are not down-weighted are the consequence of dividing entries in Table 3 by n − 1 = 40, while in 2011 the divisor was n = 41. The weighted total variance is 0.836, smaller than that obtained with unit weights (1.020), using in both cases the same divisor (n − 1 = 40).
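The relation between the weighted and the traditional variation matrix stated above can be checked with a short sketch. It takes the statement literally, i.e. each weighted entry is assumed to be $p_i p_j$ times the sample variance of the simple log-ratio, with divisor n − 1. The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def variation_matrix(X):
    """Sample variances of all pairwise log-ratios ln(x_i / x_j), divisor n - 1."""
    L = np.log(np.asarray(X, dtype=float))
    diff = L[:, :, None] - L[:, None, :]   # diff[k, i, j] = ln(x_ki / x_kj)
    return diff.var(axis=0, ddof=1)

def weighted_variation_matrix(X, p):
    """Entries p_i * p_j * Var[ln(x_i / x_j)] (assumed form, see lead-in)."""
    p = np.asarray(p, dtype=float)
    return np.outer(p, p) * variation_matrix(X)

# Synthetic four-part data set with 41 rows (NOT the Cat10 data).
rng = np.random.default_rng(1)
X = rng.lognormal(size=(41, 4))
p = np.array([0.1, 0.3, 1.0, 1.0])

T = variation_matrix(X)                 # reference P_0 (unit weights)
Tp = weighted_variation_matrix(X, p)    # weighted version

# Shifting the data by p only adds constants to the log-ratios,
# so the log-ratio variances themselves are unchanged.
T_shifted = variation_matrix(X / p)
```

Entries involving only parts with weight one are identical in `T` and `Tp`, which is why only the divisor explains the differences with an unweighted analysis for those parts.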
In compositional data analysis, principal component analysis (PCA) is commonly performed using the singular value decomposition (SVD) of the clr-transformed data set (Aitchison 1983). The scores, multiplied by the singular values, are proportional to ilr-coordinates, such that their variances are proportional to the squared singular values. The loadings matrix contains the clr representation of the principal directions. The last singular value is zero, as the clr data sum to zero for each data point. Similar features are expected for a PCA performed on a weighted composition using its weighted $\mathrm{clr}_p$-transformed values. However, when the $\mathrm{clr}_p$-transformed data set is SVD-decomposed directly, the squared singular values are no longer proportional to $\mathrm{ilr}_p$ variances and they do not provide a decomposition of the weighted total variance. The approach proposed here consists of rescaling the $\mathrm{clr}_p$ data prior to the SVD, so that the resulting squared singular values add up to the total variance.
Let $X$ be a compositional data set in $\mathcal{S}^D$; therefore, $X$ is an $(n, D)$-matrix, where $n$ is the size of the sample. After selecting some positive weights, $p$, each row of the data matrix is shifted accordingly, and the result is written as $Y = X \ominus p$. Applying the $\mathrm{clr}_p$ transformation to each row yields $\mathrm{clr}_p(Y)$. This $\mathrm{clr}_p$-transformed data set is centered and weighted with the square root of the weights in $p$, that is,

$$A = \left(\mathrm{clr}_p(Y) - \mathbf{1}_n\,\overline{\mathrm{clr}_p(Y)}\right)\mathrm{diag}(\sqrt{p}),$$

where $\overline{\mathrm{clr}_p(Y)}$ denotes the average by columns of $\mathrm{clr}_p(Y)$ and $\mathbf{1}_n$ is a column of $n$ ones. The SVD of $A$,

$$A = U \Lambda V^\top,$$

has the standard properties of an SVD. Some of these properties are reinterpreted in the compositional framework. The singular values contained in the diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{D-1}, 0)$ are non-negative and in decreasing order of magnitude; the last one is zero because the weighted sum of the components of $\mathrm{clr}_p(Y)$ is zero (Equation 4). The non-standardized scores $U\Lambda$ are $\mathrm{ilr}_p$-coordinates whose sample variances are $\lambda_i^2/(n-1)$. The sample total variance is

$$\mathrm{totVar}_p(Y) = \sum_{i=1}^{D-1} \frac{\lambda_i^2}{n-1}.$$

The first $D-1$ columns of $V$ contain the contrast matrix corresponding to the $\mathrm{ilr}_p$. The loadings are given by the columns of $\mathrm{diag}(1/\sqrt{p})\,V\Lambda$, where $\mathrm{diag}(1/\sqrt{p})$ appears to compensate the previous weighting in $A$.
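The weighted PCA procedure just described can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: $\mathrm{clr}_p$ is assumed to be the log of each part minus the $p$-weighted mean of the logs (which satisfies the weighted zero-sum property mentioned above; the exact definition is Equation 3, not reproduced in this section), the data are synthetic, and the names `clr_p` and `weighted_pca` are hypothetical.

```python
import numpy as np

def clr_p(Y, p):
    """Assumed weighted clr: log parts minus their p-weighted mean log, row by row."""
    p = np.asarray(p, dtype=float)
    L = np.log(np.asarray(Y, dtype=float))
    return L - (L @ p / p.sum())[:, None]

def weighted_pca(X, p):
    """SVD of the centered clr_p data with columns weighted by sqrt(p)."""
    X = np.asarray(X, dtype=float)
    p = np.asarray(p, dtype=float)
    n = X.shape[0]
    Y = X / p                                     # shift by p (closure is irrelevant for clr_p)
    Z = clr_p(Y, p)
    A = (Z - Z.mean(axis=0)) * np.sqrt(p)         # center, then weight each column
    U, lam, Vt = np.linalg.svd(A, full_matrices=False)
    scores = U * lam                              # non-standardized scores U Lambda
    loadings = (Vt.T * lam) / np.sqrt(p)[:, None] # diag(1/sqrt(p)) V Lambda
    total_var = float(np.sum(lam**2) / (n - 1))
    explained = 100.0 * lam**2 / np.sum(lam**2)   # percent of total variance per axis
    return scores, loadings, lam, total_var, explained

# Synthetic 41 x 5 data set (NOT the Cat10 data).
rng = np.random.default_rng(3)
X = rng.lognormal(sigma=0.6, size=(41, 5))
p = np.array([0.1, 0.3, 1.0, 1.0, 0.5])

scores, loadings, lam, total_var, explained = weighted_pca(X, p)
print("percent explained by the first two axes:", explained[:2].sum())
```

With these definitions the last singular value vanishes up to rounding, and the total variance recovered from the singular values equals the sum of the weighted $\mathrm{clr}_p$ variances, $p_i\,\mathrm{Var}(\mathrm{clr}_{p,i}[Y])$, as reported in the last column of Table 3.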
A covariance biplot (Aitchison and Greenacre 2002) is a simultaneous projection of $U$ (scores) and $V\Lambda$ (loadings) onto two principal directions, usually the first two. The percent of weighted total variance explained in such a projection onto the first two directions is given by

$$100 \cdot \frac{\lambda_1^2 + \lambda_2^2}{\sum_{i=1}^{D-1} \lambda_i^2}.$$

Biplots of this kind have been obtained for the data set Cat10. Figure 5 shows four different cases: the top-left panel shows the biplot when the reference is $p_0 = (1, 1, \ldots, 1)$; the top-right panel adopts the weights $p_i$ shown in Table 2; the bottom-left panel shows the biplot when using $p_i^{(\mathrm{sub})}$, also shown in Table 2 (third row). Finally, the bottom-right panel shows the biplot obtained using the subcomposition of individual parties, excluding "abs", "nota", "null", and "other", and using the reference $p_0$ for the subcomposition.
The first impression is that the two biplots in the upper part of Figure 5 appear to be quite similar, as the main features are preserved. In fact, the clr-variables corresponding to well defined parties are projected very similarly. For instance, the first principal axis is dominated by the clr-variables corresponding to C's on one side, and CiU and ERC on the opposite side, which can be identified with a balance of non-nationalist versus nationalist Catalan parties; this fact was previously observed in the weighted variation matrix. The second principal axis is mainly influenced by the links PP-ICV and PSC-ICV, which leads to identifying the second principal axis with a balance of right versus left wing parties. In fact, the three parties involved are perceived by electors as right wing (PP), very moderate social-democratic (PSC) and left wing (ICV). However, when looking at the clr-variables corresponding to down-weighted parts (abs, null, nota, other), the shortening of the corresponding rays, proportional to $\sqrt{p_i}$, is apparent. For example, the role of clr-other in the projection has been appreciably reduced.
In the bottom-left panel of Figure 5, the weights $p_i^{(\mathrm{sub})}$ (Table 2) have been used in order to approach a subcompositional analysis of the parties C's, CiU, ERC, ICV, PP, PSC. As the rest of the parts are severely down-weighted, they appear as very short rays from the origin (labels are overlapping). Compared with the subcompositional analysis (bottom-right panel, Figure 5), it is clear that, except for these short rays, the two bottom biplots are almost identical. See, for instance, that the total variances of the two cases are, respectively, 0.7020 (weights $p_i^{(\mathrm{sub})}$) and 0.7016 (unit weights in the subcomposition), and the corresponding proportions of explained total variance in the two-dimensional projections are very close. This illustrates the fact that down-weighting some parts is a path towards subcompositional analysis.

Figure 5: Covariance biplots for the Cat10 data (figure not reproduced; one recovered panel title reads "w-covariance biplot", first axis var% 73.5, second axis var% 11.9). Top-left panel: unit weights (reference $p_0$). Top-right panel: weights given in Table 2, weighted total variance 0.836. Bottom-left panel: extreme weighting, given in Table 2, weighted total variance 0.7020. Bottom-right panel: subcomposition of parties, total variance 0.7016.
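The limiting behaviour described above can be illustrated numerically. The sketch below assumes that the weighted total variance is the sum of $p_i\,\mathrm{Var}(\mathrm{clr}_{p,i})$, with $\mathrm{clr}_p$ taken as log parts minus their $p$-weighted mean log (an assumed form of Equation 3, which is not reproduced in this section). The data are synthetic and the function name is hypothetical.

```python
import numpy as np

def weighted_total_variance(X, p):
    """Sum over parts of p_i * Var(clr_p,i), divisor n - 1 (assumed definition).

    The shift by p is omitted: it only adds constants to the logs and
    therefore does not change any variance.
    """
    p = np.asarray(p, dtype=float)
    L = np.log(np.asarray(X, dtype=float))
    Z = L - (L @ p / p.sum())[:, None]
    return float(np.sum(p * Z.var(axis=0, ddof=1)))

# Synthetic six-part data set (NOT the Cat10 data); the last two parts
# play the role of the parts to be down-weighted.
rng = np.random.default_rng(2)
X = rng.lognormal(sigma=0.7, size=(41, 6))
keep = np.array([True, True, True, True, False, False])

# Ordinary total variance of the subcomposition (unit weights).
tv_sub = weighted_total_variance(X[:, keep], np.ones(keep.sum()))

for eps in (1.0, 0.1, 0.01, 0.001):
    p = np.where(keep, 1.0, eps)
    print(f"weight {eps:6.3f}: weighted total variance {weighted_total_variance(X, p):.4f}"
          f"  (subcomposition: {tv_sub:.4f})")
```

As the weights of the last two parts shrink, the printed weighted total variance approaches that of the subcomposition, in line with the values 0.7020 and 0.7016 reported for the Cat10 example.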
The fact that the projection changes only slightly from top to bottom of Figure 5 indicates that the variance introduced by "abs" is small (see Table 3) and that the variance of "nota" and "null" is not well represented in the first and second principal axes. A feature that is clear in the weighted biplot (top-right panel) is that the link "null-other" is almost parallel to the second axis and to the link PSC-ICV: the variances of these two log-ratios are mainly included in the second principal component. The "nota" and "null" votes are closely associated with each other across electoral districts, as the rays appear almost parallel (see also Table 3). When they are down-weighted (top-right panel), the main effect is that the corresponding rays are equally shortened, as the weights are equal for these two parts.
The so-called balance-dendrogram is not discussed here in detail, as the changes to be incorporated when using weights are quite obvious. Firstly, a balance-dendrogram presents a hierarchical structure describing an SBP, which in the weighted case is identical to the standard case. Secondly, the decomposition of the total variance changes quantitatively with weighting, as indicated in Equation 14 (second member). Finally, the position of mean balances is replaced by the new mean weighted balances. However, the qualitative structure of the dendrogram remains the same.
The present study of different exploratory tools for compositional data analysis is only preliminary. Details on interpretation and methods to assess weights require further study.

Conclusions and further research
A weighting strategy for the analysis of compositions is proposed. It is based on the theory of Bayes Hilbert spaces. However, some modifications have been introduced to fulfill the principle of dominance of distances when down-weighting some parts of the composition.
When all weights are equal to one, that is, when there is no down- or up-weighting, the approach reduces to standard compositional data analysis. If the weights of some parts approach zero, the weighted geometry of the simplex tends to the ordinary Aitchison geometry of the corresponding subcomposition.
In order to use the proposed weighting approach, it is advisable to deal with compositional data as usual for linear operations, using the standard perturbation and powering. When distances or inner products are involved in the analysis, they are computed in two steps: first, shifting the compositional data by $p$, that is, dividing each part by the corresponding weight; and second, computing $\mathrm{clr}_p$ (Equations 3 or 16) or $\mathrm{ilr}_p$ (Equation 10) to find the required distances or inner products in a straightforward way.
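The two-step computation of a distance can be sketched as follows. The forms used for $\mathrm{clr}_p$ (log parts minus their $p$-weighted mean log) and for the weighted distance (Euclidean distance of $\mathrm{clr}_p$ vectors with each squared difference multiplied by $p_i$) are assumptions standing in for Equations 3 and 16, which are not reproduced in this section. The function names and the example compositions are hypothetical.

```python
import numpy as np

def clr_p(y, p):
    """Assumed weighted clr of a single composition y (1-D array)."""
    p = np.asarray(p, dtype=float)
    logs = np.log(np.asarray(y, dtype=float))
    return logs - (logs @ p) / p.sum()

def dist_p(x1, x2, p):
    """Weighted distance between two compositions, computed in two steps."""
    p = np.asarray(p, dtype=float)
    # Step 1: shift both compositions by p (divide each part by its weight).
    y1 = np.asarray(x1, dtype=float) / p
    y2 = np.asarray(x2, dtype=float) / p
    # Step 2: weighted Euclidean distance between the clr_p representations.
    d = clr_p(y1, p) - clr_p(y2, p)
    return float(np.sqrt(np.sum(p * d**2)))

# Two hypothetical four-part compositions and a weight vector.
x1 = np.array([0.10, 0.20, 0.30, 0.40])
x2 = np.array([0.25, 0.25, 0.40, 0.10])
p = np.array([0.1, 0.3, 1.0, 1.0])

d_weighted = dist_p(x1, x2, p)
d_unit = dist_p(x1, x2, np.ones(4))   # unit weights: ordinary Aitchison distance
print(d_weighted, d_unit)
```

With unit weights the function returns the ordinary Aitchison distance, and rescaling either composition by a positive constant leaves the result unchanged, so scale invariance is preserved in this sketch.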
Statistical consequences of weighting compositions need to be studied in the future. Standard tools of exploratory analysis, such as the variation matrix, biplots, balance-dendrograms, clustering and others, will be influenced by weighting. The reason is that distances between compositions and the computation of variances and covariances are influenced as well. Thus, the proposed weighting approach is only a first step towards developing effective weighting techniques applicable to compositional data analysis.
where compositions and their $\mathrm{clr}_p$ and $\mathrm{ilr}_p$ transforms are considered row-vectors. Note that each component of $b = (b_1, b_2, \ldots, b_{D-1})$ is a weighted inner product of $\mathrm{clr}_p(y)$ with the corresponding $\mathrm{clr}_p$ of a balancing element. The inverse $\mathrm{ilr}_p$ transformation is readily obtained using the properties (9) of $\Psi$,

$$\mathcal{C}y = \mathcal{C}\exp(\mathrm{ilr}_p(y)\,\Psi), \qquad \mathrm{clr}_p(y) = \mathrm{ilr}_p(y)\,\Psi,$$

the first of these relations being formally identical to the standard inverse ilr with reference measure $P_0$. The relationship between $\mathrm{ilr}_p(y)$ and $\mathrm{ilr}(x)$ is developed in Appendix B.

Table 1 :
A generic table of an SBP for a five-part composition. Weights from the reference measure are placed in the second row, under the part label. Rows are labelled as balances $b_i$ for further reference.
parts: $y_1$ $y_2$ $y_3$ $y_4$ $y_5$
weights: $p_1$ $p_2$ $p_3$ $p_4$

Table 2 :
Weights $p$ (second row) and $p^{(\mathrm{sub})}$ (third row); the latter are only used in an example of biplot. Center of the composition, expressed in percent, for the original composition (fourth row), and for the shifted composition $Y = X \ominus p$ (fifth row).

Table 3 :
Weighted variation matrix for Cat10 data. Last column: weighted $\mathrm{clr}_p$ variances, $p_i \mathrm{Var}(\mathrm{clr}_{p,i}[Y])$, adding to weighted total variance. Upper triangle: elements of the weighted variation matrix (values greater than or equal to 0.30 are highlighted in boldface). Lower triangle: product of weights $p_i p_j$. Last two rows: weighted total variance and total variance (uniform reference).