A Logistic Normal Mixture Model for Compositional Data Allowing Essential Zeros

John Bear, Dean Billheimer


The usual candidate distributions for modeling compositions, the Dirichlet and the logistic normal, do not include zero components in their support. Methods have been developed and refined for rounded zeros and for zeros that arise when a value falls below a detection limit. Methods have also been developed for zeros in compositions arising from count data. However, essential zeros in continuous compositions, where a component is truly absent, remain a problem.
The most promising approach extends the logistic normal distribution to model essential zeros using a mixture of additive logistic normal distributions of different dimensions, related by common parameters. We continue this approach and, by imposing an additional constraint, develop a likelihood and show ways of estimating the location and dispersion parameters. The proposed likelihood, conditional on the parameters for the probability of zeros, is a mixture of additive logistic normal distributions of different dimensions whose location and dispersion parameters are projections of a common location or dispersion parameter. For some simple special cases, we compare the relative efficiency of different location estimators.
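To make the idea of "projections of a common parameter" concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the additive log-ratio (alr) transform uses the last part as its reference, that the reference part is never zero (an assumption made here for illustration; the abstract does not state what the additional constraint is), and that projection simply selects the alr coordinates of the parts that are present. The function names `alr`, `project_params`, and `loglik_one` are hypothetical. The log-likelihood shown is for a single composition, conditional on its pattern of zeros.

```python
import numpy as np

def alr(x):
    """Additive log-ratio transform with the last part as reference."""
    x = np.asarray(x, dtype=float)
    return np.log(x[:-1] / x[-1])

def project_params(mu, sigma, present):
    """Keep the alr coordinates of the non-reference parts that are present.

    mu, sigma: common location (length D-1) and dispersion ((D-1) x (D-1)).
    present:   boolean vector of length D, True where the part is nonzero.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    idx = np.flatnonzero(np.asarray(present)[:-1])
    return mu[idx], sigma[np.ix_(idx, idx)]

def loglik_one(x, mu, sigma):
    """Log-density of one composition, conditional on its zero pattern."""
    x = np.asarray(x, dtype=float)
    present = x > 0
    if not present[-1]:
        # Assumption of this sketch: the reference part is always nonzero.
        raise ValueError("the reference (last) part must be nonzero")
    sub = x[present]
    sub = sub / sub.sum()          # close the subcomposition of present parts
    y = alr(sub)
    k = y.size
    if k == 0:
        return 0.0                 # only the reference part is present
    mu_s, sigma_s = project_params(mu, sigma, present)
    resid = y - mu_s
    _, logdet = np.linalg.slogdet(sigma_s)
    quad = resid @ np.linalg.solve(sigma_s, resid)
    log_normal = -0.5 * (k * np.log(2.0 * np.pi) + logdet + quad)
    # Jacobian of the alr transform on the simplex: 1 / product of the parts.
    return log_normal - np.log(sub).sum()

# Example with three parts, where the second part is an essential zero.
mu = np.array([0.5, -0.3])
sigma = np.array([[1.0, 0.2], [0.2, 2.0]])
print(loglik_one([0.2, 0.0, 0.8], mu, sigma))
```

In this sketch a composition with a zero in the second part is evaluated under a one-dimensional logistic normal whose mean and variance are the first entries of the common `mu` and `sigma`, so compositions with different zero patterns share the same underlying parameters.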





DOI: http://dx.doi.org/10.17713/ajs.v45i4.117



@Matthias Templ (using Open Journal Systems) -- see previous editions at http://www.stat.tugraz.at/AJS/Editions.html