Bayesian Poisson regression with global-local shrinkage priors

Primary supervisor

Enes Makalic

Poisson regression is a fundamental tool for modeling count data, appearing ubiquitously in applications ranging from epidemiology (disease counts) and ecology (species abundance) to economics (patent counts) and social sciences (event frequencies). The classical generalized linear model (GLM) framework treats count outcomes as Poisson-distributed with a log-linear relationship to covariates. However, modern datasets often involve high-dimensional predictor spaces where the number of covariates p is large relative to, or even exceeds, the sample size n, presenting significant challenges for traditional estimation methods.
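
Concretely, writing y_i for the count and x_i for the p-vector of covariates of observation i, the model is

```latex
y_i \mid x_i \sim \mathrm{Poisson}(\mu_i), \qquad \log \mu_i = \beta_0 + x_i^{\top}\beta, \qquad i = 1, \dots, n.
```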

In high-dimensional settings, regularization becomes essential to prevent overfitting and improve prediction. Frequentist approaches such as the LASSO (ℓ1 penalization) and the elastic net have proven effective for variable selection and sparse estimation, but they have well-known limitations: (i) point estimates come with no natural uncertainty quantification, (ii) results are sensitive to the choice of tuning parameter, and (iii) grouped or correlated predictors are handled poorly. Bayesian approaches offer a natural alternative, replacing the penalty with a prior distribution that encodes sparsity assumptions while providing full posterior uncertainty quantification.
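
The two viewpoints are formally linked: the LASSO estimate is the posterior mode under independent Laplace (double-exponential) priors on the coefficients,

```latex
\hat{\beta}_{\mathrm{LASSO}}
= \arg\max_{\beta} \left\{ \ell(\beta) - \lambda \lVert \beta \rVert_1 \right\}
= \arg\max_{\beta} \log p(\beta \mid y),
\qquad \beta_j \overset{\mathrm{iid}}{\sim} \mathrm{Laplace}(0, 1/\lambda),
```

where ℓ(β) is the Poisson log-likelihood. A fully Bayesian treatment retains the whole posterior rather than only its mode, which is the source of the uncertainty quantification noted above.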

Global-local shrinkage priors represent a powerful class of hierarchical Bayesian priors that have revolutionized sparse estimation. These priors, including the horseshoe (Carvalho et al., 2010), the Dirichlet-Laplace (Bhattacharya et al., 2015), the generalized double Pareto (Armagan et al., 2013), and the R2-D2 prior (Zhang et al., 2022), combine a global shrinkage parameter (controlling overall sparsity) with local parameters (allowing individual coefficients to escape shrinkage when warranted). This hierarchical structure yields near-optimal behavior: negligible effects are shrunk strongly toward zero, while truly important signals escape with minimal shrinkage.
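
In the Poisson setting, a generic global-local hierarchy takes the form

```latex
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}\!\left(0, \lambda_j^2 \tau^2\right),
\qquad \lambda_j \sim \pi_{\lambda}, \qquad \tau \sim \pi_{\tau},
\qquad j = 1, \dots, p,
```

where the global scale τ pulls all coefficients toward zero and the heavy-tailed local scales λ_j let individual signals escape. The horseshoe is the half-Cauchy choice λ_j ∼ C⁺(0, 1), τ ∼ C⁺(0, 1); the other members of the class differ only in the mixing distributions π_λ and π_τ.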

The horseshoe prior, in particular, has gained prominence due to its theoretical properties: it concentrates posterior mass near zero for small coefficients while its heavy tails resist over-shrinking large effects. Unlike the LASSO's Laplace prior, which biases large coefficients toward zero, the horseshoe's Cauchy-like tails preserve signal strength. Extending these priors to Poisson regression with potentially overdispersed or zero-inflated counts raises both theoretical and computational challenges, chief among them an intractable posterior that requires advanced MCMC or variational inference methods.
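
This behavior is most transparent through the shrinkage factor. In the normal-means case, the posterior mean of each coefficient can be written as E[β_j | y_j] = (1 − E[κ_j | y_j]) y_j, where

```latex
\kappa_j = \frac{1}{1 + \tau^2 \lambda_j^2} \in (0, 1),
```

and under the horseshoe with τ = 1 the implied prior on κ_j is Beta(1/2, 1/2): a U-shaped density with mass piled near κ_j = 0 (signals pass through nearly untouched) and near κ_j = 1 (noise is shrunk to zero), which is the shape that gives the prior its name.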

Aim/outline

Aim 1: Theoretical Development of Global-Local Shrinkage Priors for Poisson GLMs
Derive and analyze posterior properties of global-local shrinkage priors (horseshoe, Dirichlet-Laplace, generalized double Pareto, R2-D2) applied to Poisson regression coefficients. Investigate posterior contraction rates, oracle properties, and selection consistency in high-dimensional settings where p >> n. Extend theoretical results to handle overdispersion (negative binomial regression), zero-inflation, and hierarchical structures common in count data applications.

Aim 2: Computational Methods for Posterior Inference
Develop efficient computational strategies for posterior sampling in high-dimensional Poisson regression with global-local priors. Implement state-of-the-art MCMC algorithms including Hamiltonian Monte Carlo (HMC), the No-U-Turn Sampler (NUTS), and specialized Gibbs samplers exploiting Pólya-Gamma data augmentation (which applies directly to the negative binomial representation of overdispersed counts). Address computational bottlenecks specific to count-data GLMs and assess mixing, convergence diagnostics, and scalability to thousands of predictors.
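
As an illustration of the HMC/NUTS route, a horseshoe Poisson model takes only a few lines in a probabilistic programming framework. The sketch below uses NumPyro as one possible choice; the simulated data, hyperparameters, and variable names are illustrative, and in practice the regularized horseshoe of Piironen & Vehtari (2017) is often preferred for stable sampling:

```python
import jax.numpy as jnp
import jax.random as random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def horseshoe_poisson(X, y=None):
    """Poisson regression with a horseshoe prior on the coefficients."""
    p = X.shape[1]
    # Global and local half-Cauchy scales of the horseshoe hierarchy.
    tau = numpyro.sample("tau", dist.HalfCauchy(1.0))
    lam = numpyro.sample("lam", dist.HalfCauchy(jnp.ones(p)))
    # Non-centered parameterization: beta_j = tau * lam_j * z_j aids mixing.
    z = numpyro.sample("z", dist.Normal(jnp.zeros(p), 1.0))
    beta = numpyro.deterministic("beta", tau * lam * z)
    alpha = numpyro.sample("alpha", dist.Normal(0.0, 10.0))  # intercept
    numpyro.sample("y", dist.Poisson(jnp.exp(alpha + X @ beta)), obs=y)

# Simulated sparse example: 3 nonzero coefficients out of 50.
X = random.normal(random.PRNGKey(0), (200, 50))
beta_true = jnp.zeros(50).at[:3].set(jnp.array([1.0, -1.0, 0.5]))
y = dist.Poisson(jnp.exp(X @ beta_true)).sample(random.PRNGKey(1))

mcmc = MCMC(NUTS(horseshoe_poisson), num_warmup=1000, num_samples=1000)
mcmc.run(random.PRNGKey(2), X, y=y)
mcmc.print_summary()
```

NumPyro transforms the positive scales to the unconstrained log scale automatically, which is what lets a gradient-based sampler cope with the heavy-tailed half-Cauchy hierarchy.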

Aim 3: Empirical Applications and Method Comparison
Conduct comprehensive simulation studies evaluating prediction accuracy, variable selection performance (sensitivity, specificity, F1-score), uncertainty quantification, and computational efficiency across diverse scenarios including varying sparsity levels, signal strengths, and sample sizes. Apply methods to real-world datasets from genomics (RNA-seq gene expression counts), epidemiology (disease surveillance), ecology (species counts), and text analysis (word frequencies). Compare against established methods including penalized GLMs (LASSO, elastic net), Bayesian variable selection approaches, and machine learning alternatives.
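
For the selection metrics, one simple and widely used decision rule flags a coefficient as selected when its marginal posterior credible interval excludes zero. A sketch of the corresponding computation follows; the 95% level and the interval rule are illustrative choices, and thresholding posterior shrinkage weights is a common alternative:

```python
import numpy as np

def selection_metrics(beta_samples, beta_true, level=0.95):
    """Selection metrics from posterior draws of shape (n_draws, p)."""
    tail = (1.0 - level) / 2.0
    lo = np.quantile(beta_samples, tail, axis=0)
    hi = np.quantile(beta_samples, 1.0 - tail, axis=0)
    selected = (lo > 0) | (hi < 0)   # credible interval excludes zero
    truth = beta_true != 0
    tp = np.sum(selected & truth)    # true signals found
    fp = np.sum(selected & ~truth)   # noise variables selected
    fn = np.sum(~selected & truth)   # signals missed
    tn = np.sum(~selected & ~truth)  # noise correctly dropped
    return {
        "sensitivity": tp / max(tp + fn, 1),
        "specificity": tn / max(tn + fp, 1),
        "f1": 2 * tp / max(2 * tp + fp + fn, 1),
    }
```

In the NumPyro sketch above, beta_samples would be mcmc.get_samples()["beta"].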

URLs/references

Global-Local Shrinkage Priors Foundation:

  • Carvalho, C. M., Polson, N. G., & Scott, J. G. (2010). The horseshoe estimator for sparse signals. Biometrika, 97(2), 465-480. [Seminal horseshoe prior paper]
  • Polson, N. G., & Scott, J. G. (2012). On the half-Cauchy prior for a global scale parameter. Bayesian Analysis, 7(4), 887-902.
  • Bhattacharya, A., Pati, D., Pillai, N. S., & Dunson, D. B. (2015). Dirichlet-Laplace priors for optimal shrinkage. Journal of the American Statistical Association, 110(512), 1479-1490.
  • Piironen, J., & Vehtari, A. (2017). Sparsity information and regularization in the horseshoe and other shrinkage priors. Electronic Journal of Statistics, 11(2), 5018-5051. [Excellent overview and comparison]

Advanced Shrinkage Priors:

  • Armagan, A., Dunson, D. B., & Lee, J. (2013). Generalized double Pareto shrinkage. Statistica Sinica, 23(1), 119-143.
  • Bhadra, A., Datta, J., Polson, N. G., & Willard, B. (2019). Lasso meets horseshoe: A survey. Statistical Science, 34(3), 405-427. [Unifying theoretical perspective]
  • Zhang, Y. D., Naughton, B. P., Bondell, H. D., & Reich, B. J. (2022). Bayesian regression using a prior on the model fit: The R2-D2 shrinkage prior. Journal of the American Statistical Association, 117(538), 862-874.

Poisson Regression and Count Data:

  • Hilbe, J. M. (2014). Modeling Count Data. Cambridge University Press.
  • Gelman, A., & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. 

Computational Methods:

  • Polson, N. G., Scott, J. G., & Windle, J. (2013). Bayesian inference for logistic models using Pólya-Gamma latent variables. Journal of the American Statistical Association, 108(504), 1339-1349. [Key for efficient MCMC]
  • Makalic, E., & Schmidt, D. F. (2016). A simple sampler for the horseshoe estimator. IEEE Signal Processing Letters, 23(1), 179-182.
  • Johndrow, J. E., Orenstein, P., & Bhattacharya, A. (2020). Scalable approximate MCMC algorithms for the horseshoe prior. Journal of Machine Learning Research, 21(1), 1-61.