%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% HEADER
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass[12pt]{article}
\usepackage[T1]{fontenc}
\usepackage[latin1]{inputenc}
\usepackage[pdftex]{graphicx} %%graphics in pdfLaTeX
\usepackage{amsmath,rotating,longtable,url}
\usepackage{fancyhdr}
\setlength{\topmargin}{-1.5cm} \setlength{\textwidth}{16.5cm}
\setlength{\textheight}{23.5cm} \setlength{\oddsidemargin}{0cm}
\setlength{\evensidemargin}{0cm} \setlength{\parindent}{0cm}
\setlength{\parskip}{0cm}
\newcommand{\ep}{\hfill\hfill$\Box$}
\newcommand{\p}{\overline{p}}
\newcommand{\s}{\sigma^{2}}
\newcommand{\W}{$\textbf{W}$}
\renewcommand{\arraystretch}{1}
\renewcommand{\baselinestretch}{1}
\usepackage{Sweave}
\begin{document}
\title{Bayesian Model Averaging (BMA) with Uncertain Spatial Effects\\[0.5cm]
A Tutorial}
\author{Martin Feldkircher}
\date{This version: October 2010}
\maketitle
<<echo=FALSE>>=
options(width=75)
@
This file illustrates the computer code needed to use spatial filtering in the context of Bayesian Model Averaging (BMA).
For more details, and in case you use the code, please cite Crespo Cuaresma and Feldkircher (2010).
To get the code started you need to install the R-packages \verb+spdep+, \verb+BMS+ (version $\geq 0.2.5$) and the add-on package \verb+spatBMS+.
For an introduction to \verb+BMS+ see \url{http://bms.zeugner.eu} and the tutorials therein.
The following file has been tested with \verb+R 2.11+.
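A minimal setup sketch, assuming \verb+spdep+ and \verb+BMS+ are available from CRAN (\verb+spatBMS+ may have to be installed from a local archive instead):
<<eval=FALSE>>=
install.packages(c("spdep", "BMS"))   # CRAN packages
library(spdep)
library(BMS)
library(spatBMS)                      # add-on package
@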
\section{A Primer to Spatial Filtering and BMA}
Consider a cross-sectional regression of the following form:
\begin{equation}
y=\alpha \iota_{N} + \rho \textbf{W} y + \textbf{X}_k \vec{\chi}_k + \sigma \varepsilon
\label{equ: sar}
\end{equation}
where $y$ is an $N$-dimensional column vector of the dependent variable, $\alpha$ is the intercept term, $\iota_{N}$ is an $N$-dimensional
column vector of ones, $\textbf{X}_k = ( \textbf{x}_1 \ldots \textbf{x}_k )$ is a matrix whose columns are stacked data for $k$
explanatory variables and $\vec{\chi}_k = ( \chi_1 \ldots \chi_k )' $ is the $k$-dimensional parameter vector corresponding to
the variables in $\textbf{X}_k$. The spatial autocorrelation structure is specified via a spatial weight matrix $\textbf{W}$. The coefficient $\rho$ attached to $\textbf{W}$ reflects the degree of spatial autocorrelation.
Equation (\ref{equ: sar}) is in the vein of a parametric spatial model where the spatial parameter $\rho$ is often interpreted
as a spillover parameter.
In this setting, an extra degree of uncertainty arises on top of the uncertainty regarding the choice of explanatory variables:
we do \textit{not} know the actual nature of the spatial interactions that we model through the spatial autoregressive term in equation (\ref{equ: sar})
if we conduct inference conditional on a single $\textbf{W}$. Spatial autocorrelation will be observable whenever the phenomenon under study is
a spatial process or omitted variables cause spatial variation in the residuals (Tiefelsdorf and Griffith, 2007). Note that both
arguments typically apply to economic cross-section data, where economic units interact with each other
and omitted variables decrease the level of confidence in econometric analysis. Since inference from the SAR model is conditional on the weight matrix $\textbf{W}$, which has to be specified exogenously,
explicitly accounting for this source of model uncertainty is a natural generalization of the BMA treatment of uncertainty about the composition of \textbf{X}$_k$. In most applications there is little theoretical guidance on
which structure to impose on the weight matrix, rendering its specification a serious challenge. \\
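Candidate weight matrices can be constructed with \verb+spdep+, for instance by row-standardizing a $k$-nearest-neighbour relation; the coordinates and the number of neighbours below are purely illustrative:
<<eval=FALSE>>=
library(spdep)
coords <- cbind(runif(50), runif(50))   # illustrative coordinates
knn    <- knearneigh(coords, k = 5)     # 5 nearest neighbours (arbitrary choice)
nb     <- knn2nb(knn)
W      <- nb2mat(nb, style = "W")       # row-standardized weight matrix
@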
\subsection{Spatial Filtering}
The spatial filtering literature seeks to remove residual spatial autocorrelation patterns prior to estimation and is in principle
not interested in directly estimating $\rho$ in (\ref{equ: sar}). The approach put forward by Getis and Griffith (2002) and
Tiefelsdorf and Griffith (2007) is based on an eigenvector decomposition of a transformed $\textbf{W}$ matrix, where the transformation
depends on the underlying spatial model. The eigenvectors $\{\vec{e}_i\}$ are included as additional explanatory variables, and regression equation
(\ref{equ: sar}) becomes
\begin{equation}
y=\alpha \iota_{N} +\sum_{i=1}^{E}\gamma_i\vec{e}_i + \textbf{X}_k \vec{\chi}_k + \sigma \varepsilon,
\label{equ: spatFilt}
\end{equation}
where each eigenvector $\vec{e}_i$ spans one of the spatial dimensions. By introducing the eigenvectors into the regression,
we explicitly take care of (remaining) spatial patterns in the residuals. Furthermore, spatial commonalities among
the covariates in $\textbf{X}_k$ are conditioned out. This reduces the degree of multicollinearity and further separates spatial effects
from the `intrinsic' impact the employed regressors exert on the dependent variable.\\
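The decomposition can be sketched as follows; the centering projection $M = I - \iota_N\iota_N'/N$ used here corresponds to the transformation of Tiefelsdorf and Griffith (2007) for a constant-only model, and the number of retained eigenvectors is an arbitrary assumption:
<<eval=FALSE>>=
n   <- nrow(W)
M   <- diag(n) - matrix(1, n, n) / n      # centering projection I - 11'/n
MWM <- M %*% ((W + t(W)) / 2) %*% M       # symmetrized, transformed W
ev  <- eigen(MWM, symmetric = TRUE)
E   <- ev$vectors[, 1:10]                 # keep first 10 eigenvectors (arbitrary cutoff)
@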
\subsection{BMA with uncertain spatial effects}
From a Bayesian perspective, the problem of obtaining estimates of the parameter associated with a covariate under uncertainty in both the nature of \textbf{W} and \textbf{X}$_k$
can be handled in a straightforward manner. Let us assume that we are interested in a particular regression coefficient, $\beta$.
Denote the set of potential models by $\mathcal{M}=\{M^{1}_{1}, M^{1}_{2},\ldots, M^{1}_{2^K}, M^{2}_{1},$
$\ldots, M^{2}_{2^K},\ldots, M^{Z}_{1},\ldots, M^{Z}_{2^K} \}$,
where $K$ stands for the number of potential explanatory variables and $Z$ for the number of candidate spatial weighting matrices
$\textbf{W}_z$, $z=1,\ldots, Z$, each with an associated set of eigenvectors $E_z$. The cardinality of $\mathcal{M}$ is therefore $2^K \times Z$.
A particular model, say $M^{z}_{k}$, is characterized by its parameter vector $\theta^{z}_{k}=(\alpha, \vec{\chi}_k, \vec{\gamma}_z)$
corresponding to the intercept term included in all models, the coefficients on the regressors entering the model and the coefficients on the set of
eigenvectors $E_z$ related to $\textbf{W}_z$.
In the BMA framework, the posterior distribution of $\beta$ now takes the form
\begin{equation}
p(\beta|y)=\sum_{j=1}^{2^K} \sum_{z=1}^{Z}p(\beta|M^z_{j}, y)p(M^z_{j}| y)
\label{post}
\end{equation}
with $y$ denoting the data and $\beta$ the coefficient of interest.
Inference on $\beta$ is based on single inferences under models $j=1,\ldots,2^{K}\times Z$ weighted by their respective posterior model probabilities,
$p(M_{j}^{z}| y)$, which in turn depend on the corresponding matrix of spatial weights. For more technical details please see Crespo Cuaresma and Feldkircher (2010).
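Equation (\ref{post}) amounts to a weighted average of the single-model posteriors, with the posterior model probabilities as weights. A purely illustrative numeric sketch with three hypothetical models:
<<eval=FALSE>>=
pmp  <- c(0.5, 0.3, 0.2)     # hypothetical posterior model probabilities
beta <- c(0.80, 1.10, 0.00)  # hypothetical posterior means of beta per model
sum(pmp * beta)              # model-averaged posterior mean: 0.73
@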
\subsection{An Illustration: The Boston Housing Data}
In \verb+R+ we will do spatial filtering with the `Boston housing data', originally published in Harrison and Rubinfeld (1978). The dependent variable (\verb+CMEDV+) contains the
corrected median value of owner-occupied homes in USD 1,000s for 506 observations. Among the explanatory variables we have, among others, per capita crime (\verb+CRIM+), the pupil-teacher ratio by town
(\verb+PTRATIO+), the proportion of owner-occupied units built prior to 1940 (\verb+AGE+), the proportion of non-retail business acres per town (\verb+INDUS+), a variable that is proportional to
the share of African Americans per town (\verb+B+) and a variable reflecting the nitric oxides concentration (\verb+NOX+). For more details please see \verb+?dataBoston+. We start
by loading the data in \verb+R+:
<<>>=