Kernel Density Estimation: Theory and Application in Discriminant Analysis

Authors

  • Thomas Ledl, Department of Statistics and Decision Support Systems, University of Vienna, Austria

DOI:

https://doi.org/10.17713/ajs.v33i3.441

Abstract

A wide range of methods is now available for estimating the density function of a random variable nonparametrically. Since the introduction of the most elementary nonparametric density estimator, the histogram, researchers have produced a vast number of ideas, particularly on the choice of the bandwidth parameter in kernel density estimation. Beyond purely descriptive use, the kernel density estimator is well suited to discriminant analysis, where (multivariate) class densities are the basis for assigning a vector to a given class. This article gives insight into the most popular bandwidth selectors and into the performance of the kernel density estimator as a classification method compared with classical linear and quadratic discriminant analysis. Both direct estimation in the multivariate space and an application of the concept to marginal normalizations of the single variables are considered. The results point out the gap between theory and application.
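The idea of using class densities as the basis for assignment can be illustrated with a minimal sketch. It is not the article's implementation: it assumes univariate data, a Gaussian kernel, a rule-of-thumb bandwidth of the kind described by Silverman (1986), and class priors estimated from the class sizes. The function names (`silverman_bandwidth`, `kde`, `kde_classify`) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def silverman_bandwidth(x):
    """Rule-of-thumb bandwidth for a Gaussian kernel (univariate; an assumption here)."""
    x = np.asarray(x, dtype=float)
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25
    # Use the smaller of the two spread measures, falling back to sd if the IQR is zero.
    scale = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * scale * x.size ** (-1 / 5)


def kde(x_train, x_eval, h):
    """Gaussian kernel density estimate of x_train, evaluated at the points x_eval."""
    x_train = np.asarray(x_train, dtype=float)
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Pairwise scaled distances: one row per evaluation point, one column per sample.
    u = (x_eval[:, None] - x_train[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (x_train.size * h * np.sqrt(2 * np.pi))


def kde_classify(samples_by_class, x_eval):
    """Assign each evaluation point to the class with the largest prior * density."""
    labels = list(samples_by_class)
    n_total = sum(len(s) for s in samples_by_class.values())
    scores = []
    for label in labels:
        sample = samples_by_class[label]
        prior = len(sample) / n_total  # prior estimated from class proportions
        scores.append(prior * kde(sample, x_eval, silverman_bandwidth(sample)))
    best = np.argmax(np.vstack(scores), axis=0)
    return [labels[i] for i in best]


# Simulated two-class example (illustrative data, not from the article).
rng = np.random.default_rng(0)
train = {
    "a": rng.normal(-2.0, 1.0, 200),
    "b": rng.normal(2.0, 1.0, 200),
}
predicted = kde_classify(train, [-3.0, 3.0])
print(predicted)
```

Linear and quadratic discriminant analysis follow the same assignment rule but replace the kernel estimate with a fitted normal density, so in a sketch like this only the `kde` call would change.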

Published

2016-04-03

Section

Articles

How to Cite

Ledl, T. (2016). Kernel Density Estimation: Theory and Application in Discriminant Analysis. Austrian Journal of Statistics, 33(3), 267-279. https://doi.org/10.17713/ajs.v33i3.441