BMS: Frequently Asked Questions
If you have any question that is not answered below, please feel free to ask us directly.
Contents:
- General Questions on the BMS Package:
- Questions on Estimating in BMS:
- Why does bms omit an entire data row if there is a single NA observation in it?
- How can I get the values plotted in density.bma?
- How can I get the actual values for prior and posterior model size distributions (such as in plotModelsize)?
- How can I do Bayesian Model Selection (not averaging) with BMS?
- Questions on Error Messages:
- I cannot install BMS - I just get the message: 'Warning: unable to access index for repository ...'
- I cannot install BMS because I am lacking root privileges - I get a message like: 'is not writeable'
- I get the warning message: 'Argument 'X.data' contains NAs. The corresponding rows have not been taken into account.'?
- What can I do with the following error message: 'The design matrix you provided seems to overly suffer from linear dependence; ...'
- I get the error message: 'Please separate column names of interaction terms by # (e.g. A#B)'
- No matter what I do, I always get 'Error: could not find function ...'
- I want to estimate with categorical (factor) variables, but I always get the message 'Error in colMeans(X) : 'x' must be numeric'
General Questions on the BMS Package:
How fast is BMS?
The answer depends on the specific task to be done. In general, doing 1 million draws with 100,000 burn-ins takes three to five minutes on a modern PC.
Can I use BMS in Matlab?
Yes, if you are a Windows user. The BMS toolbox for Matlab provides an interface to the core functionality of BMS from within Matlab. (It still requires an installation of R to do the computations, though.) Check the BMS for Matlab Tutorial for some screenshots.
What kind of model priors can I use?
There are 5 kinds of model size priors implemented. In addition it is possible to define custom model size priors or custom prior inclusion probabilities. For more information consult the appendix A.1 in Bayesian Model Averaging with BMS.
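As a rough illustration, the sketch below fits the same data under three of the built-in model priors. It assumes that the option strings "fixed", "random" and "uniform" for the mprior argument and the mprior.size argument behave as documented in help(bms); the short chain lengths are chosen only to keep the example quick.

```r
library(BMS)
data(datafls)

# Same data, three different model priors (short chains, illustration only)
m_fixed   <- bms(datafls, burn = 1000, iter = 5000,
                 mprior = "fixed", mprior.size = 7, user.int = FALSE)
m_random  <- bms(datafls, burn = 1000, iter = 5000,
                 mprior = "random", mprior.size = 7, user.int = FALSE)
m_uniform <- bms(datafls, burn = 1000, iter = 5000,
                 mprior = "uniform", user.int = FALSE)

# Compare posterior inclusion probabilities under the three priors
pips <- cbind(fixed   = coef(m_fixed,   order.by.pip = FALSE)[, "PIP"],
              random  = coef(m_random,  order.by.pip = FALSE)[, "PIP"],
              uniform = coef(m_uniform, order.by.pip = FALSE)[, "PIP"])
head(pips)
```

Custom model size priors and custom prior inclusion probabilities are set up differently; see the appendix mentioned above for their exact form.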
Can I retain a set of fixed variables that appears in all the models visited?
Yes, via the argument fixed.reg in the function bms.
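A minimal sketch of this, assuming the datafls example data shipped with BMS and picking the regressors GDP60 and Confucian purely for illustration:

```r
library(BMS)
data(datafls)

# Keep GDP60 and Confucian in every sampled model (illustrative choice)
m_fix <- bms(datafls, burn = 1000, iter = 5000,
             fixed.reg = c("GDP60", "Confucian"), user.int = FALSE)

# The fixed regressors should show a posterior inclusion probability of 1
coef(m_fix)[c("GDP60", "Confucian"), "PIP"]
```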
Can I compute predictive densities with BMS?
Yes, through the function pred.density.
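The following sketch fits BMA on the first 70 rows of the datafls example data and forecasts the two remaining rows. The split and the short chain lengths are assumptions made for illustration; see help(pred.density) for the full set of arguments and return values.

```r
library(BMS)
data(datafls)

# Estimate on the first 70 observations only
fcst_bma <- bms(datafls[1:70, ], mprior = "uniform",
                burn = 1000, iter = 5000, user.int = FALSE)

# Predictive densities for the two held-out observations
pdens <- pred.density(fcst_bma, newdata = datafls[71:72, ])

# Expected values of the two predictive densities
pdens$fit
```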
How can I get the source code of the BMS functions (including comments)?
The easiest way is to download the BMS package zip file for older Windows versions. The zip file contains a folder named "R-ex" with function files.
Alternatively, you can download the BMS.RData workspace and load it with the command load. The BMS functions, including their comments, are then available in memory.
Questions on Estimating in BMS:
Why does bms omit an entire data row if there is a single NA observation in it?
Because it is theoretically problematic to do BMA over models of different sample size. BMA rests on the assumption that the probability of the response variable, p(y), is constant over all models. When mixing different y-vectors (with different sample sizes), this is no longer the case. Therefore bms compares only models that all have exactly the same response variable vector y.
How can I get the values plotted in density.bma?
density.bma actually produces an object of class 'density' that contains this information (cf. help(density)). You could try e.g. data(attitude); library(BMS); att=bms(attitude); foo=density(att,1); plot(foo). The object foo then contains the density information.
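To pull out the plotted numbers themselves, a sketch along these lines should work, assuming (as stated above) that the returned object is of class 'density' and therefore carries the usual x and y components:

```r
library(BMS)
data(attitude)

att <- bms(attitude, user.int = FALSE)
foo <- density(att, 1)   # marginal posterior density of the first coefficient

# Grid points (x) and density values (y) behind the plot
dens_values <- data.frame(x = foo$x, y = foo$y)
head(dens_values)
```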
How can I get the actual values for prior and posterior model size distributions (such as in plotModelsize)?
The posterior model size distribution can be most easily accessed via plotModelsize: try e.g. foo=plotModelsize(YOURBMA); print(foo). (Here YOURBMA is a bma object resulting from the bms function.)
The prior model size distribution is stored in the bma object and can be accessed via YOURBMA$mprior.info$mp.Kdist.
How can I do Bayesian Model Selection (not averaging) with BMS?
Bayesian Model Selection simply means taking the model with the highest posterior model probability from the results, which is possible with nearly any BMA software. BMS offers dedicated functionality for this; consider the following commands:
data(datafls); mfls=bms(datafls); # does model sampling with FLS data
print(mfls$topmod[1]) # shows the included variables in the best model (No. 1)
zfls=as.zlm(mfls,1) # extracts the best model from the BMA results
summary(zfls) # shows some information about the model
The help pages help(as.zlm) and help(zlm) provide information on how to extract further data. Note that you can also plot posterior coefficient densities for single (zlm) models with density(zfls).
Questions on Error Messages:
I cannot install BMS - I just get the message: 'Warning: unable to access index for repository ...'
You are probably running an older R version under Windows. Try manual installation as described here.
I cannot install BMS because I am lacking root privileges - I get a message like: 'is not writeable'
If you do not have the rights to install BMS into the general library path, try creating a personal library instead. Look for a directory where you have write permission (for demonstration, let it be /u/YOU/lib) and type the following:
install.packages("BMS",lib="/u/YOU/lib")
In order to load the package, you have to augment the library(BMS) command as follows:
library("BMS",lib.loc="/u/YOU/lib")
In case that does not work either, resort to the BMS.RData that provides the same functionality, but without the conveniences of a package.
I get the warning message: 'Argument 'X.data' contains NAs. The corresponding rows have not been taken into account.'?
The function bms omits all data rows that contain at least one NA (not available), because doing BMA over models with different numbers of observations raises theoretical problems.
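To see in advance which rows would be affected, you can check for incomplete rows with base R. The sketch below uses the built-in attitude data with two NAs inserted by hand for illustration; it mirrors the row-dropping rule described above and is not the internal code of bms.

```r
data(attitude)

d <- attitude
d[3, "complaints"] <- NA   # artificial missing values, for illustration only
d[10, "raises"]    <- NA

# Rows containing at least one NA (these would be left out)
dropped <- which(!complete.cases(d))
dropped

# Data restricted to complete rows, i.e. the sample actually used
d_clean <- d[complete.cases(d), ]
nrow(d_clean)
```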
What can I do with the following error message: 'The design matrix you provided seems to overly suffer from linear dependence; ...'
Your data seem to be collinear, or to include a constant. Check for the following first: Did you accidentally include a constant term in your data? If you use dummies, could a linear combination of all your dummies add up to a constant?
If the problem persists, look at the eigenvalues of your correlation matrix with eigen(cor(YOURDATA))$values and check whether some eigenvalues are very close to zero, and whether you can get rid of them by omitting a specific variable.
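The eigenvalue check can be sketched with base R as follows. The example uses the built-in attitude data and adds a deliberately collinear column (here called dup, a hypothetical name) so that one eigenvalue becomes numerically zero; the eigenvector belonging to that eigenvalue hints at which variables are involved.

```r
data(attitude)

X <- attitude[, -1]                       # regressors only
X$dup <- X$complaints + X$learning        # artificial exact linear dependence

e  <- eigen(cor(X))
ev <- e$values
round(ev, 6)                              # the smallest value is numerically zero

# Loadings of the near-zero eigenvalue: large entries mark the culprits
culprit <- e$vectors[, which.min(ev)]
names(culprit) <- colnames(X)
round(culprit, 3)

# After omitting the offending variable, no eigenvalue is close to zero
ev_fixed <- eigen(cor(X[, colnames(X) != "dup"]))$values
min(ev_fixed)
```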
I get the error message: 'Please separate column names of interaction terms by # (e.g. A#B)'
This happens when you provide the function bms with the argument mcmc="bd.int" or mcmc="rev.jump.int", which means you intend to use an interaction sampler. The interaction sampler only includes an interaction term in a model when all of its base variables are included as well.
Therefore, you need to designate your interaction terms and specify which variables each of them consists of. For instance, if your data frame has two variables called 'A' and 'B' and an interaction of the two, then name the interaction term 'A#B' in your data frame (e.g. if your interaction term is the 4th column of your data, then do this via names(YOURDATA)[4] <- "A#B").
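As a sketch of the naming step, the following builds a small data frame from the datafls example data, adds one interaction term and labels it in the required way. The choice of variables and the short chain lengths are assumptions made only for illustration.

```r
library(BMS)
data(datafls)

# Response plus three base regressors (illustrative selection)
d <- datafls[, c("y", "GDP60", "LifeExp", "PrScEnroll")]

# Add the interaction and name it after its base variables, separated by '#'
d$inter <- d$GDP60 * d$LifeExp
names(d)[ncol(d)] <- "GDP60#LifeExp"

# The interaction sampler can now match the term to GDP60 and LifeExp
m_int <- bms(d, mcmc = "bd.int", burn = 1000, iter = 5000, user.int = FALSE)
coef(m_int)
```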
No matter what I do, I always get 'Error: could not find function ...'
Check whether you accidentally forgot to execute library(BMS) prior to executing your command.
I want to estimate with categorical (factor) variables, but I always get the message 'Error in colMeans(X) : 'x' must be numeric'
BMS is not built to work with the factor class. An easy work-around is to convert your factor (i.e. categorical) data into their underlying numerical values before estimation, as in the following example:
data(iris); #load some data
bms(iris) #does not work
bms(as.data.frame(lapply(iris,as.numeric))) #does work.
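Note that converting a factor with more than two levels to its numeric codes treats the levels as if they were ordered and equally spaced. If that is not what you want, one possible alternative (a sketch using base R's model.matrix, not a built-in BMS feature) is to expand the factor into dummy variables and drop the intercept column, so that no constant ends up in the data:

```r
library(BMS)
data(iris)

# Expand factors into 0/1 dummies; [, -1] drops the intercept (constant) column
d <- as.data.frame(model.matrix(~ ., data = iris)[, -1])
names(d)   # Sepal.Length stays first and is therefore used as the response

m_dummy <- bms(d, user.int = FALSE)
coef(m_dummy)
```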