The horseshoe estimator for sparse signals

@article{Carvalho2010TheHE,
  title={The horseshoe estimator for sparse signals},
  author={Carlos M. Carvalho and Nicholas G. Polson and James G. Scott},
  journal={Biometrika},
  year={2010},
  volume={97},
  pages={465--480},
  url={https://api.semanticscholar.org/CorpusID:378688}
}
This paper proposes a new approach to sparsity, called the horseshoe estimator, which arises from a prior based on multivariate-normal scale mixtures. We describe the estimator's advantages over existing approaches, including its robustness, adaptivity to different sparsity patterns and analytical tractability. We prove two theorems: one that characterizes the horseshoe estimator's tail robustness and the other that demonstrates a super-efficient rate of convergence to the correct estimate of… 
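The scale-mixture construction in the abstract can be sketched directly: each coordinate gets a standard half-Cauchy local scale and a conditionally normal draw. This is a minimal illustration (the global scale is fixed at 1 here; variable names are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Horseshoe prior as a scale mixture of normals:
#   lambda_i ~ C+(0, 1)            (standard half-Cauchy local scale)
#   theta_i | lambda_i ~ N(0, lambda_i^2)   (global scale tau fixed at 1)
lam = np.abs(rng.standard_cauchy(n))   # half-Cauchy draws
theta = rng.normal(0.0, lam)           # horseshoe prior draws

# The implied shrinkage weight kappa_i = 1 / (1 + lambda_i^2) follows a
# Beta(1/2, 1/2) law; its U-shaped ("horseshoe") density, with mass piled
# near 0 (no shrinkage) and 1 (total shrinkage), gives the prior its name.
kappa = 1.0 / (1.0 + lam**2)
print(kappa.mean())  # close to the Beta(1/2, 1/2) mean of 0.5
```

The bimodal histogram of `kappa` is a quick visual check that heavy tails (robustness for large signals) and a pole at the origin (strong shrinkage of noise) coexist in one prior.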


Adaptive posterior contraction rates for the horseshoe

It is proved that the MMLE is an effective estimator of the sparsity level, in the sense that it leads to (near) minimax optimal estimation of the underlying mean vector generating the data.

Handling Sparsity via the Horseshoe

This paper presents a general, fully Bayesian framework for sparse supervised-learning problems based on the horseshoe prior, which is a member of the family of multivariate scale mixtures of normals and closely related to widely used approaches for sparse Bayesian learning.

The Horseshoe+ Estimator of Ultra-Sparse Signals

It is proved that the horseshoe+ posterior concentrates at a rate faster than that of the horseshoe in the Kullback-Leibler (K-L) sense, and theoretically that the proposed estimator has lower posterior mean squared error in estimating signals than the horseshoe and achieves the optimal Bayes risk in testing up to a constant.

Sparse Horseshoe Estimation via Expectation-Maximisation

This work proposes a novel expectation-maximisation (EM) procedure for computing the MAP estimates of the parameters in the case of the standard linear model and introduces several simple modifications of this EM procedure that allow for straightforward extension to generalised linear models.

A Laplace Mixture Representation of the Horseshoe and Some Implications

The horseshoe prior, defined as a half Cauchy scale mixture of normal, provides a state of the art approach to Bayesian sparse signal recovery. We provide a new representation of the horseshoe…

The Horseshoe Estimator: Posterior Concentration around Nearly Black Vectors

We consider the horseshoe estimator due to Carvalho, Polson and Scott (2010) for the multivariate normal mean model in the situation that the mean vector is sparse in the nearly black sense. We…

Sparse Estimation with Generalized Beta Mixture and the Horseshoe Prior

Experimental results show that the proposed GBM and Horseshoe distributions outperform state-of-the-art methods on a wide range of sparsity levels and amplitudes in terms of reconstruction accuracy, convergence rate and sparsity.

Bayesian Robust Regression with the Horseshoe+ Estimator

The first efficient Gibbs sampling algorithm for the horseshoe+ estimator for linear and logistic regression models is developed, which represents the state-of-the-art in Bayesian machine learning techniques.

The Graphical Horseshoe Estimator for Inverse Covariance Matrices

The proposed graphical horseshoe estimator has attractive properties compared to other popular estimators, such as the graphical lasso and the graphical smoothly clipped absolute deviation and provides estimates with small information divergence from the sampling model when the true inverse covariance matrix is sparse.
...

Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences

An empirical Bayes approach to the estimation of possibly sparse sequences observed in Gaussian white noise is set out and investigated. The prior considered is a mixture of an atom of probability at…

Bayesian prediction with adaptive ridge estimators

This work proposes a simpler way to account for model uncertainty that is based on generalized ridge regression estimators and demonstrates how to efficiently mix over different sets of basis functions, letting the data determine which are most appropriate for the problem at hand.

Estimation with Quadratic Loss

It has long been customary to measure the adequacy of an estimator by the smallness of its mean squared error. The least squares estimators were studied by Gauss and by other authors later in the…

Information-theoretic asymptotics of Bayes methods

The authors examine the relative entropy distance D_n between the true density and the Bayesian density and show that the asymptotic distance is (d/2)(log n) + c, where d is the dimension of the parameter vector.

Inadmissibility of the Usual Estimator for the Mean of a Multivariate Normal Distribution

If one observes the real random variables X_1, …, X_n, independently normally distributed with unknown means ξ_1, …, ξ_n and variance 1, it is customary to estimate ξ_i by X_i. If the loss is the sum of…

Objective Bayesian model selection in Gaussian graphical models

These studies reveal that the combined use of a multiplicity-correction prior on graphs and fractional Bayes factors for computing marginal likelihoods yields better performance than existing Bayesian methods.

Inference for nonconjugate Bayesian Models using the Gibbs sampler

The Gibbs sampler technique is proposed as a mechanism for implementing a conceptually and computationally simple solution in such a framework and the result is a general strategy for obtaining marginal posterior densities under changing specification of the model error densities and related prior densities.

The Bayesian Lasso

The Lasso estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters have independent Laplace (i.e., double-exponential) priors.…
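The mode-matching claim in this snippet can be checked numerically in the simplest case, a single normal observation with a Laplace prior, where the MAP estimate is the familiar soft-thresholding rule. A minimal sketch (the threshold σ²λ follows from the stated prior; all names are illustrative):

```python
import numpy as np

def soft_threshold(y, t):
    """MAP estimate for y ~ N(beta, sigma^2) under a Laplace(0, 1/lam) prior."""
    return np.sign(y) * max(abs(y) - t, 0.0)

def neg_log_post(beta, y, sigma2, lam):
    # Negative log posterior, up to constants:
    #   (y - beta)^2 / (2 sigma^2) + lam * |beta|
    # which is exactly the (scalar) lasso objective.
    return (y - beta) ** 2 / (2 * sigma2) + lam * abs(beta)

y, sigma2, lam = 1.3, 1.0, 0.7
grid = np.linspace(-3, 3, 60001)
beta_grid = grid[np.argmin(neg_log_post(grid, y, sigma2, lam))]
beta_map = soft_threshold(y, sigma2 * lam)   # threshold t = sigma^2 * lam
print(beta_map, beta_grid)  # both close to 0.6 = 1.3 - 0.7
```

The agreement between the closed-form threshold and the brute-force grid minimiser is the sense in which the lasso estimate "is" a Bayesian posterior mode.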

Sparse Bayesian Learning and the Relevance Vector Machine

It is demonstrated that by exploiting a probabilistic Bayesian learning framework, the 'relevance vector machine' (RVM) can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages.
...