Exploring Dirichlet mixture and logistic Gaussian process priors in density estimation, regression and sufficient dimension reduction
We establish that the Dirichlet location scale mixture of normal priors and the logistic Gaussian process priors can be used to build computationally feasible and asymptotically consistent nonparametric Bayesian procedures for density estimation, regression and sufficient dimension reduction in regression with many covariates.
We extend the known results on consistency properties of the Dirichlet mixture priors in density estimation to the case of location scale mixtures, the most commonly used prior distributions for nonparametric density estimation. We show that, with a standard choice of the hyperparameters, posterior consistency obtains at a large class of densities satisfying a simple fractional moment condition. This class contains many heavy tailed densities, e.g., the Cauchy densities. For linear regression with an unknown error distribution, we use a symmetrized version of a Dirichlet location scale mixture of normal prior and prove that posterior consistency obtains at error distributions which admit a density that satisfies a mild moment condition.
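As an illustration of the kind of prior discussed above, the following minimal sketch draws one random density from a truncated stick-breaking approximation to a Dirichlet location scale mixture of normals. The function name, the truncation level, the concentration parameter `alpha`, and the base measure (normal locations, inverse-gamma variances) are assumptions made for this example, not the hyperparameter choices studied in this work.

```python
import numpy as np

def sample_dp_mixture_density(grid, alpha=1.0, n_atoms=200, seed=0):
    """Evaluate one random density, drawn from a truncated Dirichlet
    location scale mixture of normals, at the points in `grid`."""
    rng = np.random.default_rng(seed)

    # Stick-breaking weights: w_k = v_k * prod_{j<k} (1 - v_j).
    v = rng.beta(1.0, alpha, size=n_atoms)
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    w /= w.sum()  # renormalise the mass lost to truncation

    # Atoms (location, scale) from a hypothetical base measure.
    mu = rng.normal(0.0, 3.0, size=n_atoms)
    sigma = np.sqrt(1.0 / rng.gamma(shape=2.0, scale=1.0, size=n_atoms))

    # Mixture density: sum_k w_k * N(x | mu_k, sigma_k^2).
    z = (grid[:, None] - mu[None, :]) / sigma[None, :]
    comps = np.exp(-0.5 * z**2) / (sigma[None, :] * np.sqrt(2.0 * np.pi))
    return comps @ w

grid = np.linspace(-50.0, 50.0, 2001)
density = sample_dp_mixture_density(grid)
```

Changing `seed` gives a different random density; a posterior computation would additionally condition these weights and atoms on data, which this sketch does not attempt.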
We also initiate a study of posterior consistency properties of logistic Gaussian process priors in the context of density estimation. For density estimation on a known bounded interval, we show that many logistic Gaussian process priors give rise to posterior distributions which are consistent at any continuous density function. For density estimation on an unbounded interval, we use a semiparametric procedure that embeds a parametric family of densities as the center of a logistic Gaussian process. We show that the resulting procedure is consistent at any continuous density whose tails decay faster than the tails of some members of the embedded parametric family.
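The logistic transform behind these priors can be sketched on a bounded interval as follows: draw a Gaussian process path on a grid, exponentiate it, and normalise so that it integrates to one. The squared-exponential covariance, its parameters, and the Riemann-sum normalisation are assumptions chosen for this example; the results above cover a broader class of Gaussian processes.

```python
import numpy as np

def sample_logistic_gp_density(grid, length_scale=0.2, amplitude=1.0, seed=0):
    """Draw one random density on an equally spaced grid from a
    logistic Gaussian process prior (illustrative kernel choice)."""
    rng = np.random.default_rng(seed)
    n = len(grid)

    # Squared-exponential covariance matrix on the grid (an assumption).
    diff = grid[:, None] - grid[None, :]
    cov = amplitude**2 * np.exp(-0.5 * (diff / length_scale) ** 2)

    # Sample W ~ N(0, cov) through an eigendecomposition, clipping tiny
    # negative eigenvalues caused by round-off.
    evals, evecs = np.linalg.eigh(cov)
    path = evecs @ (np.sqrt(np.clip(evals, 0.0, None)) * rng.standard_normal(n))

    # Logistic transform: f(t) = exp(W(t)) / integral of exp(W(s)) ds,
    # with the integral approximated by a Riemann sum.
    unnorm = np.exp(path - path.max())
    dx = grid[1] - grid[0]
    return unnorm / (unnorm.sum() * dx)

grid = np.linspace(0.0, 1.0, 101)
density = sample_logistic_gp_density(grid)
```

Every draw is strictly positive and continuous, which gives some intuition for why consistency at continuous densities on a bounded interval is the natural target.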
A major part of our study concentrates on developing a new Markov chain Monte Carlo (MCMC) based computing method to draw posterior inference in density estimation with a logistic Gaussian process prior. This method gains speed by drawing samples from the posterior of a finite dimensional surrogate prior, which is obtained by imputation of the underlying Gaussian process. We establish that imputation results in quite accurate computation. Simulation studies show that accuracy and high speed can be combined.
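One way to read the imputation idea is that the whole Gaussian process path is replaced by its conditional mean given its values at a small set of nodes, so that the density depends on only finitely many parameters. The sketch below follows that reading; the helper names, the squared-exponential kernel, the node placement, and the jitter term are all assumptions for this example and may differ from the construction actually developed here.

```python
import numpy as np

def se_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D point sets (an assumption)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def imputed_path(grid, nodes, node_values, length_scale=0.2, jitter=1e-8):
    """Conditional mean E[W(t) | W(nodes) = node_values] of a zero-mean GP."""
    k_nn = se_kernel(nodes, nodes, length_scale) + jitter * np.eye(len(nodes))
    k_gn = se_kernel(grid, nodes, length_scale)
    return k_gn @ np.linalg.solve(k_nn, node_values)

def surrogate_density(grid, nodes, node_values, length_scale=0.2):
    """Logistic transform of the imputed path, normalised by a Riemann sum."""
    path = imputed_path(grid, nodes, node_values, length_scale)
    unnorm = np.exp(path - path.max())
    dx = grid[1] - grid[0]
    return unnorm / (unnorm.sum() * dx)

grid = np.linspace(0.0, 1.0, 101)
nodes = np.linspace(0.0, 1.0, 6)
node_values = np.array([0.0, 1.0, -0.5, 0.3, 0.8, -1.0])
density = surrogate_density(grid, nodes, node_values)
```

Under this reading, a sampler only needs to update the six entries of `node_values` rather than an entire function, which is the source of the speed gain described above.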
Our study of consistency and computation of logistic Gaussian process priors leads us to a new application of these priors in sufficient dimension reduction in regression with many covariates. We develop a semiparametric Bayesian procedure to detect a minimal number of linear combinations of the covariates which contain all the information about the response and simultaneously determine the conditional density of the response given these linear combinations. We show that all of the important linear combinations can be exhaustively retrieved (asymptotically) under mild conditions on the covariates. A simulation study indicates that our method often detects the linear combinations more accurately than existing methods. We believe that the better performance of our method is due to its ability to simultaneously deal with the finite dimensional linear combinations and the infinite dimensional conditional density.