A Logistic Normal Mixture Model for Compositional Data Allowing Essential Zeros
Abstract
The usual candidate distributions for modeling compositions, the Dirichlet and the logistic normal distribution, do not include zero components in their support. Methods have been developed and refined for zeros that arise from rounding or from values falling below a detection level. Methods have also been developed for zeros in compositions arising from count data. However, essential zeros in continuous compositions, cases where a component is truly absent, are still a problem.
The most promising approach extends the logistic normal distribution to model essential zeros using a mixture of additive logistic normal distributions of different dimensions, related by common parameters. We continue this approach: by imposing an additional constraint, we develop a likelihood and show ways of estimating the parameters for location and dispersion. The proposed likelihood, conditional on the parameters for the probability of zeros, is a mixture of additive logistic normal distributions of different dimensions whose location and dispersion parameters are projections of a common location or dispersion parameter. For some simple special cases, we compare the relative efficiency of different location estimators.
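As a rough illustration of the structure described above, and not the authors' implementation, the following Python sketch evaluates one mixture component for a composition with zeros. It rests on assumptions the abstract does not spell out: the last part is taken to be never zero and serves as the alr reference (standing in for the unspecified additional constraint), and "projection" is read as selecting the entries of a common `mu` and the matching rows and columns of a common `sigma`. The function names are hypothetical. The density is evaluated in alr coordinates, so the Jacobian term and the zero-pattern probabilities are omitted.

```python
import numpy as np


def alr_nonzero(x):
    """Additive log-ratio coordinates of the nonzero parts of x.

    Assumption for this sketch: the last part is never zero and is the
    reference. Returns the coordinates and the indices of the nonzero
    non-reference parts.
    """
    x = np.asarray(x, dtype=float)
    if x[-1] <= 0:
        raise ValueError("the reference (last) part must be positive")
    present = np.flatnonzero(x[:-1] > 0)
    return np.log(x[present] / x[-1]), present


def conditional_log_density(x, mu, sigma):
    """Normal log density of the alr coordinates of x, given its zero pattern.

    mu and sigma are the common location and dispersion parameters; the
    component for this zero pattern uses only the entries that correspond
    to the nonzero parts (assumed reading of "projection").
    """
    y, present = alr_nonzero(x)
    mu_sub = np.asarray(mu, dtype=float)[present]
    sigma_sub = np.asarray(sigma, dtype=float)[np.ix_(present, present)]
    resid = y - mu_sub
    _, logdet = np.linalg.slogdet(sigma_sub)
    quad = resid @ np.linalg.solve(sigma_sub, resid)
    return -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet + quad)


# Common parameters for a 3-part composition (2 alr coordinates).
mu = np.zeros(2)
sigma = np.eye(2)

full = conditional_log_density([0.2, 0.3, 0.5], mu, sigma)       # no zeros: 2-dimensional component
with_zero = conditional_log_density([0.2, 0.0, 0.8], mu, sigma)  # one zero: 1-dimensional component
```

Compositions with different zero patterns are scored by components of different dimension, yet every component draws on the same `mu` and `sigma`, which is what lets observations with zeros contribute to estimating the common parameters.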
The Austrian Journal of Statistics publishes open access articles under the terms of the Creative Commons Attribution (CC BY) License.
The Creative Commons Attribution (CC BY) License allows users to copy, distribute, transmit, and adapt an article, for commercial or non-commercial purposes, as long as the author is properly attributed.
Copyright on any research article published by the Austrian Journal of Statistics is retained by the author(s). Authors grant the Austrian Journal of Statistics a license to publish the article and identify itself as the original publisher. Authors also grant any third party the right to use the article freely as long as its original authors, citation details and publisher are identified.