In the context of Gaussian linear models, the package proposes a solution of the variable selection problem using Shrinkage priors. First obtain a posterior distribution of the number of significant variables by clustering the significant variables and the noise coefficients and obtaining samples from this distribution using MCMC. Then, the significant variables are estimated from the posterior median of absolute beta. This approach requires only one tuning parameters and is applicable to continuous shrinkage priors as well.
These instructions will install the package to your computer.
In order to install the package directly from Github, you need to have the devtools package:
install.packages("devtools")
To install the package from Github, first load devtools:
library(devtools)
Next, install VsusP as follows:
::install_github("nilson01/VsusP") devtools
The package can now be loaded into R and used:
library(VsusP)
To provide an example of the VsusP package, we will first simulate some data:
# install.packages("MASS") # if not yet installed
library(MASS)
set.seed(20221208)
sim.XY <- function(n, p, beta) {
X <- matrix(rnorm(n * p), n, p)
Y <- as.vector(X %*% beta + rnorm(n, 0, 1))
return(list(X = X, Y = Y, beta = beta))
}
n <- 100
p <- 20
beta <- exp(rnorm(p))
data <- sim.XY(n, p, beta)
The function Sequential2Means
is used get the samples of
Beta
and H.b.i
. Beta
can be
calculated using different shrinkage priors and gibbs sampling
technique, whereas H.b.i is the estimated number of signals
corresponding to each tuning parameter b.i
that is computed
using S2M algorithm. If the Beta
matrix is available prior,
then H.b.i is recommended to be computed using
Sequential2MeansBeta
.
b.i <- seq(0, 1, 0.05)
S2M <- Sequential2Means(X = data$X, Y = data$Y, b.i = b.i, prior = "horseshoe+", n.samples = 5000, burnin = 2000)
Beta <- S2M$Beta
H.b.i <- S2M$H.b.i
Then, using the result from Sequential2Means
or
Sequential2MeansBeta
, appropriate tuning parameter can be
estimated using b.i Vs H.b.i plot from OptimalHbi
function.
OptimalHbi(bi = b.i, Hbi = H.b.i)
Finally, the indices of important subset of variables can be obtained
using S2MVarSelection
function. The input parameter
H
is the optimal H.b.i value from the selected tuning
parameter in OptimalHbi
function. Also, Beta
is the user defined matrix or the matrix derived from
Sequential2Means
function.
H <- 17
impVariablesGLM <- S2MVarSelection(Beta, H)
impVariablesGLM
There are several other variants of these functions as per input
cases. For example: Sequential2MeansBeta and S2MVarSelectionV1. Check
the documentation for the different functions on options and details,
using e.g., ?Sequential2Means
.
R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Li, H., & Pati, D. (2020) “Variable Selection Using Shrinkage Priors.” Computational Statistics & Data Analysis, 107, pp.107-119. doi:10.1016/j.csda.2020.106839.