Navigation

##### Personal tools
You are here: Home MGRF: simple CAR model

# GMRF: CAR model for the Scottish Lip Cancer Data

Example of conditionally autoregressive (CAR) model applied to 56 geographical Scottish regions.

If you google "Scottish Lip Cancer Data" you will find that these data are used in many textbooks on spatial modelling, also as examples for statistical software (e.g. Winbugs). We use the notation from http://www.biostat.umn.edu/~ph7440/pubh7440/slideLips.pdf

The main ingredients of the model are:

1. Counts are Poisson distributed
2. A spatially smooth term (phi) affects the expected number of counts. This is the CAR part of the model. "phi" for one particular region is normally distributed (mean,sigma^2/m), where mean is the average of the neighbouring phi's, and m is the number of neighbour for this region. The specification of such a distribution for each region collectively constitute the CAR prior.
3. There is a heterogeneity effect  theta  associate with each region, assumed to be independent between regions.

## Specifying the CAR prior

In this example the neighbourhood structure is specified in a separate file (scotlandgraph.dat

). The first part of this file looks as:

```1 3 5 9 19
2 2 7 10
3 1 12
4 3 18 20 28
5 3 1 12 19
6```

The first line means: for region "1", the neighbours are: 3,5,9,19. Region "6" (an island) has no neighbours, and no CAR prior can be assigned to it.

In the tpl file the rugged array W holds the neighboorhood information. In a loop over all regions, the CAR prior for the i'th region is calculated by the call

`car_prior(phi(W(i)),i)`

which invokes the code

```  dvariable mean = sum(phi(1,m(i)))/m(i);
g -= -0.5*square(phi(0)-mean)*m(i);```

where "mean" is the average of the neighbouring values. The variance is set to be 1/m (m=number of neighboors),  which phi is scaled by "sigma" in the Poisson likelihood.

## Running the program

CAR models should be run with the -shess" command line option. For this small example the run time is anyway small, but for large datasets this is crucial.