Bayesian Smoothing of Lung Cancer Data in Tirol, Salzburg and Vorarlberg

Disease maps are a valuable tool in descriptive epidemiology. They are helpful in obtaining insight into the geographical distribution of disease occurrence and in identifying clusters of unusual high or low risk. Such maps can provide useful suggestions about the underlying causes of diseases (see, e.g. Zatonski et al., 1996, Pesch et al., 1994). If the disease is rare or the geographical regions are small, the observed incidences


Introduction
Disease maps are a valuable tool in descriptive epidemiology.They are helpful in obtaining insight into the geographical distribution of disease occurrence and in identifying clusters of unusual high or low risk.Such maps can provide useful suggestions about the underlying causes of diseases (see, e.g.Zatonski et al., 1996, Pesch et al., 1994).
If the disease is rare or the geographical regions are small, the observed incidences O i in a region i can be thought to be mutually independent outcomes from a Poisson distribution.
Mapping of maximum-likelihood estimates of relative risk can be misleading due to the fact that the most extreme estimates in such maps are those based on low population.
One of several strategies for dealing with this problem is to use a Bayesian hierarchical model which combines the information contained in the data and prior information on the relative risks, expressed in form of a prior distribution.
Full Bayesian analysis is done via Gibbs sampling, a Markov Chain Monte Carlo technique that can be used to simulate the posterior distribution of the relative risk.

Motivation
In 1998 Oberaigner et al. (1998) have published a disease atlas for western Austria, containing data on cancer incidences and mortality ratios for the municipalities of Salzburg, Tirol and Vorarlberg.For each region i they computed E i , the number of incidences one 22 could expect in region i, if the cancer rate would be the same as throughout the country.The standardised incidence ratio then is where O i is the total number of incidences in region i (Fisher and Van Belle, 1993).
The "raw" SIRs show high variability: they range from 0, when no cancer incidences were observed, to over 30 in a region where only one case occurred.Ratios can be stabilised when using larger geographical units, e.g.districts, but then regions with higher risk will not be identifiable any more, because municipalities with lower rates in the same district will hide these effects.So they decided to map the ratios for the municipalities, but also to produce smoothed maps which are more reliable.To smooth the extreme values, they used a local mean estimator (Bowman and Azzalini, 1997) where neighbouring regions correct the value in region i towards a local mean.
These data were placed to our disposal by Dr. Oberaigner, so that we could try a different approach: a Bayesian approach (see, e.g.Besag et al., 1991).Our aim was to take a simple Bayesian model to estimate relative risks, which lie between the SIRs and a local mean, and to test whether this approach is suitable for a high number of regions where the observations are rare and -as an additional problem -the neighbourhood structure is rather inhomogeneous.The municipalities of Austria do not show a geometrical pattern as, e.g. the departments in France.
The results represented here are part of my doctoral thesis (Koboltschnig, 1998).

The Data
We focused on lung-cancer data for men, which consist of observed incidence counts per municipality for 18 age-groups for the period from 1988 to 1992.
In Figure 1 the Standardised Incidence Ratios (see formula (1)) for each of the 493 municipalities are shown.Due to the fact that lung-cancer is fairly rare, a lot of regions have a SIR of zero; but some municipalities have a high SIR.The highest value (5.69) can be found in one municipality of Tirol: four cases of lung-cancer occurred, but only 0.7 were expected.

The Model
Let i denote the true but unknown relative risk in region i.The posterior distribution of relative risks, given the data, is proportional to the likelihood function of relative risks given the observations, times the prior distribution of the relative risks, ( jO) L(Oj )p( ): (2) We use a conditionally independent Poisson model with conditional expectation for municipality i; i = 1; : : : ; N .Taking the logarithm, this model can be written as log i = log E i + r i ; (4)  Since the O i 's can be thought to be conditionally independent given the risk i we get (5) Our prior belief in the risk is formulated in terms of a prior distribution.We take a normal prior distribution with hyperparameters and 2 for the log-relative risks r i .This prior distribution reflects the assumption of exchangeable log-relative risks where the prior distribution is the same for every region.This distribution can be easily generalised to allow for the possibility of spatial correlation (Clayton and Kaldor, 1987).
If prior knowledge gives reason to assume that adjacent regions tend to have similar risks we can express this in using a normal prior distribution on the log-relative risks where the relative risks have a locally dependent error structure.As noted in Besag et al. (1991) or Mollié (1996), Gaussian intrinsic autoregression, which allows for a nonconstant conditional variance, is appropriate when the number of neighbouring regions is varying throughout the map as is the fact in Western Austria.
In our case where two regions have weight 1 if they are adjacent (= if they share a common boundary) and weight zero else, the normal conditional prior distribution of r i given all other r j and the hyperparameter 2 , has mean E (r i jr i ; 2 ) = r i ; (7) where r i is the mean of the r j in regions adjacent to region i, and r i denotes the log- relative risk in all areas j 6 = i and variance where n i denotes the number of regions adjacent to region i.

Simulating the Posterior Distribution
Because of the complexity, full Bayesian modelling seldom can be done directly, but Markov Chain Monte Carlo simulation can be used to simulate the posterior distribution.
Monte Carlo integration evaluates the expectation of a function g with respect to a density p E p (g) = Z g( )p( )d (9) by drawing samples (t) ; t = 1; : : : ; n, from the posterior distribution p( ) and then approximating the population mean through a sample mean If the samples are independent, the laws of large numbers ensure that this sum can be made as accurate as wanted (Gilks et al., 1996).
Markov Chains can be constructed in various ways.One well known algorithm is the Gibbs sampler, a special case of the single-component Metropolis-Hastings algorithm (Metropolis et al., 1953, Hastings, 1970).The Gibbs sampler requires the full conditional density of a parameter given all other parameters in the model: p( j j 1 ; 2 ; : : : ; j 1 ; j+1 ; : : : ; m ): This can be deduced from the joint posterior distribution p( j j j ) = p( ) R p( j ; j )d j ; (13) where j denotes all parameters except j .
In our model the full conditional density for the log-relative risk in region i can be written as p(r i jr i ; O;2 ) / p(O i jr i )p(r i jr i ; 2 ) (14) (Mollié, 1996), and for the precision = 1= 2 we get p( jr; O) / p(rj )p( ): The Gibbs sampler has the great advantage that good software1 is available.BUGS can be obtained via Internet 2 .BUGS carries out Bayesian inference on problems where the model consists of a defined joint distribution.It is intended for complex models for which conditional independence assumptions can be used, but for which no exact analytic solution exists.BUGS does not only carry out the numerical integrations using simulations, it also provides tools for monitoring and summarising the simulated values.BUGS is distributed as compiled code and is available for various computer platforms (SPARC, DOS, HP, DEC ALPHA, LINUX, . . .).There also exists a version for WINDOWS.WinBUGS has a graphical interface to construct the model, a standard 'point-and-click' windows interface for controlling the analysis as well as graphical tools for monitoring.
When simulating the posterior distribution of interest one has to ensure that convergence has been reached before computing an approximation.Convergence to the stationary distribution can be checked by various diagnostics implemented in CODA3 , which runs under S-Plus4 or R (Ihaka and Gentleman, 1996).

Results
Using this Bayesian model with Gibbs sampling for a map with a high number of regions and few cases can be difficult.In contrast to other models, the prior distribution for the precision = 1 2 in our model has more influence on the convergence.Use of a (0:1; 0:1)-prior for does not influence the resulting posterior distribution for examples with a lower number of regions and more observations, but convergence fails for our data.As an additional problem we have an inhomogeneous neighbourhood structure with varying numbers of neighbours.We tested different parameters for the prior distribution of and found out, that a proper gamma distribution (1; 1) yielded the best results with respect to convergence.
Convergence was then checked in the following way: we computed a pilot chain with 5000 updates and applied Raftery and Lewis (1992)´s convergence diagnostic to get out how many iterations were necessary to reach a required accuracy of estimated quantiles.We re-ran the chain with this number which was about 10000 updates and then checked convergence with Heidelberger and Welch (1983)´s, Gelman and Rubin (1992)´s and Geweke (1992)´s diagnostics which are implemented in CODA.After several trials where convergence was rejected we found out, that the chains for every parameter has to be run for 55000 iterations with a burn-in of 5000 updates and a thinning interval of ten, that means, to report only every tenth update.Updating and monitoring the simulated values took about 90 min.on a Pentium 90 with 96 MB RAM.All of the implemented diagnostics indicated that convergence could be reached for the resulting 5000 iterations per municipality.
Using CODA, we estimated the posterior mean of the i-th SIR as an average over the repeated samples generated by the Gibbs sampler, and used those estimates to produce a smoother map (see Figure 2) of lung-cancer incidences.
We also computed other statistical summaries such as the standard deviation, the posterior median or credible intervals (C.I.).A 95%-C.I. is delimited by the 2.5%-and the 97.5%-quantile, derived from the sampled values.The highest smoothed SIR is 1.720 and   [1988][1989][1990][1991][1992] can be found in Umhausen, Tirol.The posterior median in this municipality (x = 1:677) was also found to be higher than in all the other regions.Looking at the 95%-credible interval, we found out, that for this municipality the lower bound of the credible interval was 1.177, so we can be sure that this region has higher risk to obtain lung cancer than others.This result corresponds with the fact that Umhausen has a higher radon concentration, which may be a reason for increased lung-cancer risk (Wirth, 1994).
As a conclusion we can say that the Bayesian model is a valuable tool for smoothing cancer ratios, but of course both convergence diagnostics and interpretation have to be done carefully.