Comparing Spike and Slab Priors for Bayesian Variable Selection
Abstract
An important task in building regression models is to decide which regressors should be included in the final model. In a Bayesian approach, variable selection can be performed using mixture priors with a spike and a slab component for the effects subject to selection. As the spike is concentrated at zero, variable selection is based on the probability of assigning the corresponding regression effect to the slab component. These posterior inclusion probabilities can be determined by MCMC sampling. In this paper we compare the MCMC implementations for several spike and slab priors with regard to posterior inclusion probabilities and their sampling efficiency for simulated data. Further, we investigate posterior inclusion probabilities analytically for different slabs in two simple settings. Application of variable selection with spike and slab priors is illustrated on a data set of psychiatric patients where the goal is to identify covariates affecting metabolism.
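To make the notion of a posterior inclusion probability concrete, the following minimal sketch computes it in closed form for a single effect. It is an illustration under stated assumptions and not one of the settings or implementations studied in the paper: the spike is taken to be a point mass at zero, the slab a normal distribution N(0, tau2), the error variance sigma2 is treated as known, and the data enter only through the sample mean ybar of n observations. The function names and the default values for sigma2, tau2 and the prior inclusion weight w are hypothetical.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean normal distribution with variance var at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def inclusion_probability(ybar, n, sigma2=1.0, tau2=1.0, w=0.5):
    """Posterior probability that the effect belongs to the slab.

    Assumed model (illustrative only):
      y_i | beta ~ N(beta, sigma2), i = 1..n, with sigma2 known
      beta = 0             with prior probability 1 - w  (Dirac spike)
      beta ~ N(0, tau2)    with prior probability w      (normal slab)
    """
    # Marginal density of the sample mean under each mixture component.
    m_spike = normal_pdf(ybar, sigma2 / n)
    m_slab = normal_pdf(ybar, sigma2 / n + tau2)
    # Bayes' theorem applied to the component indicator.
    return w * m_slab / (w * m_slab + (1.0 - w) * m_spike)


if __name__ == "__main__":
    for ybar in (0.0, 0.2, 0.5, 1.0):
        p = inclusion_probability(ybar, n=25)
        print(f"ybar = {ybar:.1f}: inclusion probability = {p:.3f}")
```

In this toy setting the inclusion probability falls below the prior weight w when the sample mean is close to zero and approaches one as the sample mean moves away from zero. With more realistic models, where the variance is unknown and several regressors are present, such closed forms are generally unavailable, which is why the inclusion probabilities are estimated by MCMC sampling.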
The Austrian Journal of Statistics publishes open access articles under the terms of the Creative Commons Attribution (CC BY) License.
The Creative Commons Attribution License (CC BY) allows users to copy, distribute, transmit, and adapt an article, and permits both commercial and non-commercial re-use, as long as the author is properly attributed.
Copyright on any research article published by the Austrian Journal of Statistics is retained by the author(s). Authors grant the Austrian Journal of Statistics a license to publish the article and identify itself as the original publisher. Authors also grant any third party the right to use the article freely as long as its original authors, citation details and publisher are identified.