Forecasting of Daily PM10 Concentrations in Brno and Graz by Different Regression Approaches

Brno and Graz, the second largest cities of their countries, observe in each winter season daily mean PM10 concentrations which regularly exceed the limit value of 50 µg/m³. This is mainly caused by unfavorable dispersion conditions of the ambient air. Hence, partial regulation measures have to be taken in Brno and Graz, where specific decisions on certain regulations may be based on the average PM10 concentration of the next day, provided that reliable forecasts of these values are available. For several sites in the two cities we establish forecasts of daily PM10 concentrations based on multiple linear regression and generalized linear models, utilizing both measured covariates of the present day and meteorological forecasts of the next day. The comparisons, based on different quality measures, demonstrate the usefulness of both model approaches, as they yield results of similar quality. Our prediction models may support future decisions concerning possible traffic restrictions or other regulations.


Introduction
Particularly during the winter season, the basin area of Graz is exposed to weather conditions such as stationary temperature inversions, low wind velocities and rare precipitation events. These special weather conditions cause an extensive load of particulate matter (PM) in ambient air. PM (fine dust) has become a local and regional issue of agglomeration centers, large cities with heavy traffic and industrial areas. The PM10 concentration (particles with an aerodynamic diameter < 10 µm) is measured in units of µg/m³. According to regulations established by the European Union (EU) in the EC Directive (2008), the limit value for the daily PM10 average at the stations which show the highest concentration of pollution is 50 µg/m³ and must not be exceeded on more than 35 days of the calendar year (valid since January 1, 2005). Also since 2005, the annual PM10 average must not exceed the limit of 40 µg/m³. The stricter rules in Austria allowed only 30 exceedances (2005 to 2009) and reduced this even further to 25 exceedances (2010). However, at the test point Graz-Mitte (near the pedestrian zone in the center of the city) we observed, in the eight cold periods (October to March) 2002/2003 to 2009/2010 alone, between 36 (2008/2009) and 105 (2005/2006) exceedances of the daily PM10 limit and registered high annual averages between 41 µg/m³ and 49 µg/m³.
Rules in the Czech Republic are in accordance with the EU regulations. The daily PM10 average at the less exposed station Arboretum in Brno exceeded the limit value on 21 days in the cold period 2007/10 to 2008/03 and on 2 days in the cold period 2008/10 to 2009/03. In contrast, the stations Židenice and Zvonařka (both close to traffic spots) exhibited a much higher number of exceedances, from 55 to 73 days, in the cold periods 2007/11 to 2009/03. The annual PM10 average at the Brno stations varied from 27 µg/m³ to 45 µg/m³ in 2006, from 17 µg/m³ to 36 µg/m³ in 2007, from 19 µg/m³ to 44 µg/m³ in 2008 and from 18 µg/m³ to 41 µg/m³ in 2009.
It is well known that PM may cause adverse health effects, which have been analyzed in many epidemiological and toxicological studies. An extensive general review can be found in Pope III and Dockery (2006). A specific Austrian study, AUPHEP, is described in Hauck et al. (2004), while Schwarze et al. (2006) is a thorough review with 210 references of studies from different regions throughout the world.
Due to the adverse health effects caused by PM10 and in order to fulfill the EU regulations, policy makers have to react (and take drastic measures) against the PM problem. For the authorities it may be necessary to base individual decisions on reliable forecasting models for daily PM10 concentrations. Our aim is to deliver reliable PM10 forecasting models for specific sites and to demonstrate their practical applicability. Our models have been developed for the cold season (October to March), which is the period with particularly elevated PM10 concentrations at the observed sites. PM10 concentrations are considerably lower during the warm season and hence there is no urgent need for action in that part of the year. Models for Graz based on multiple linear regression were already presented in a previous study (Stadlober, Hörmann, and Pfeiler, 2008). In contrast to many other studies, it was possible to show the performance of these models in operational mode based on a three-year trial period. In Brno, generalized linear models have been used to analyze and predict PM10 levels by means of meteorological and seasonal variables (Hrdličková, Michálek, Kolář, and Veselý, 2008; Veselý, Tonner, Hrdličková, Michálek, and Kolář, 2009). Other typical approaches for PM prediction are neural networks (cf. Pérez and Reyes, 2002; Hooyberghs, Mensink, Dumont, Fierens, and Brasseur, 2005; Papanastasiou, Melas, and Kioutsioukis, 2007), discriminant analysis (cf. Silva, Pérez, and Trier, 2001), multi-gene genetic programming (Pires, Alvim-Ferraz, Pereira, and Martins, 2010) or Kalman filtering (cf. van der Wal and Jansen, 1999). Hybrid ARIMA and artificial neural network models have been applied to improve forecasting of extreme events (Díaz-Robles et al., 2008). Sfetsos and Vlachogiannis (2010) complement the three different approaches linear regression, nearest neighbors and artificial neural networks by adding hourly PM10 values of the past 48 hours to the models.
In the next section, we describe the situation in Graz and Brno with respect to sites, databases and covariates (input parameters). The multiple regression methodology (LM) and the generalized linear models (GLM) approach of our study are discussed in Section 3. Results of model comparisons and of test runs in operational mode are analyzed in Section 4. In the final Section 5 we summarize our findings and give our recommendations.
2 Situation in Graz and in Brno

Sites and database in Graz
Our analysis in Graz is based on data from two monitoring stations, the site Graz-Mitte in a traffic area near the pedestrian zone in the city center and the site Graz-Süd near a commercial and industrial area in the south. In Graz-Mitte we collected data of eight cold seasons (2002/2003 to 2009/2010, 1348 observations and 55 missing values); in Graz-Süd we had access to data of seven cold seasons (2003/2004 to 2009/2010, 1094 observations and 48 missing values). Graz is located in a basin area south of the main Alpine crest, leading to low precipitation and low wind velocities during the cold period. Additionally, stationary temperature inversions frequently occur at that time of year due to the basin location. Some core data characteristics, listed in Table 1, illustrate the situation at Graz-Mitte. Similar results were obtained for Graz-Süd.
Graz, the capital of Styria, has approximately 263 k residents and is located in the northern part of the Graz basin at 350 m above sea level. The tree-covered hills bordering the city to the north are about 400 m higher. Our two measuring stations are part of a permanent PM10 monitoring network in Graz with five test points, three of which are located in traffic areas and two in residential areas. In addition to PM10, the test points provide other pollution and meteorological data on the basis of half-hour averages. From the test point Kalkleiten, which is situated close to Graz at 710 m above sea level, we obtained temperature data to measure the effect of temperature inversion. Figure 1 shows the location of the stations on the map of Graz.
Subsequently, we introduce the input variables suitable for our multiple regression and generalized linear models. The focus is on forecasting daily averages of PM10 for day $t$ in the cold season, i.e. the period from October 1 to March 31:
• $Ap_t$: average of 48 half-hour measurements of PM10 on day $t$ (0:30 to 24:00).
Temperature inversion has the most significant impact on $Ap_t$, the PM10 concentration in Graz (see also Hörmann, Pfeiler, and Stadlober, 2005). In order to measure temperature inversion for Graz we define
• $\Delta T_t$: average temperature difference to the reference test point (Kalkleiten) on day $t$ (measured at Graz-Mitte, also used for Graz-Süd).
If the daily average $\Delta T_t$ is negative, we refer to this as an inversion day. In the cold period, the daily average $\Delta T_t$ is negative on 13 % to 47 % of the days, depending on the specific temperature conditions of the winter season. Obviously, the emergence of wind and precipitation leads to reduced PM10 concentrations. Thus, we include the two parameters
• $V_t$: average wind speed on day $t$ (measured at Graz-Mitte, also used for Graz-Süd),
• $Prec_t$: 1 = precipitation, 0 = no precipitation on day $t$ (measured at Graz-Nord).
Values of the three meteorological variables for day $t$ have to be delivered on day $t-1$, the day of the prediction. Hence, in operational mode they are only available as meteorological forecasts. Below, in Subsection 4.2, we show the reliability of our PM10 predictions based on these meteorological forecasts for test data from the cold season 2009/2010. Of course, further crucial variables for our models are the so-called lag values of PM10 and the categorized temperature:
• $Ap_{lag}$: average of half-hour measurements of PM10 from 12:30 on day $t-2$ to 12:00 on day $t-1$,
• $T_{lag}$: categorized average of half-hour temperature measurements from 12:30 on day $t-2$ to 12:00 on day $t-1$ (= 0 if temp > 0, = 1 if temp ≤ 0) (measured at Graz-Mitte, also used for Graz-Süd).
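The lag window defined above can be sketched in a few lines. This is only an illustration: the function names `lag_average` and `frost_category`, and the convention that each half-hour value is keyed by the timestamp at the end of its interval, are assumptions and not taken from the study.

```python
from datetime import date, datetime, time, timedelta


def lag_average(measurements, day_t):
    """Mean of half-hour values stamped 12:30 on day t-2 through 12:00 on day t-1.

    measurements: dict mapping a datetime (assumed end of the half-hour
    interval) to a measured value; missing intervals are simply absent.
    Returns None if no value falls into the window.
    """
    start = datetime.combine(day_t - timedelta(days=2), time(12, 30))
    end = datetime.combine(day_t - timedelta(days=1), time(12, 0))
    values = [v for ts, v in measurements.items() if start <= ts <= end]
    if not values:
        return None
    return sum(values) / len(values)


def frost_category(mean_temperature):
    """Categorized lag temperature: 0 if the mean is above 0, 1 otherwise."""
    return 0 if mean_temperature > 0 else 1
```

Applying `lag_average` to the PM10 series gives a value in the spirit of $Ap_{lag}$, and applying it to the temperature series followed by `frost_category` gives the 0/1 variable $T_{lag}$. A complete window contains 48 half-hour values.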
Domestic heating is one of the human impact factors. Naturally, during frosty periods more heating is necessary, i.e. the effect of domestic heating may be estimated partially by measures of air temperature. Because of the high correlation between temperatures on consecutive days, it is reasonable to include the lag value $T_{lag}$. In operational mode, the predictions for day $t$ have to be published in the afternoon of day $t-1$ (in Graz at 14:00), so the observed values $Ap_{lag}$ and $T_{lag}$ are available in time.
An additional human impact is reflected to some extent by weekday/weekend differences of PM10. It is very likely that in Graz this effect arises from the reduced traffic load (see Hörmann et al., 2005; Stadlober et al., 2008). Additionally, we observed that the characteristics of PM10 change during the winter season (levels in February and March are higher than expected from the meteorological conditions). To model these effects we include four dummy variables indicating Saturday, Sun-/Holiday, February and March.

Sites and database in Brno
Brno, the second largest city of the Czech Republic with about 430 k residents, is an economic, cultural and political center. The city is located in a basin at 190 m to 425 m above sea level. The basin is open only to the south and is surrounded by wooded hills in the other three cardinal directions. Two rivers flow through Brno and merge south of the center. The local air quality is significantly affected by the concentration of industry and motor traffic, as the city lies at the junction of highways D1 and D2.
It can be claimed that air pollution by dust aerosol is connected especially with the winter season, when disproportionately higher emissions from the local heating plants occur. Moreover, temperature inversions as well as adverse dispersion conditions are more frequent. On the other hand, snowfall absorbs the dust aerosol better than rainfall. In the summer season, days with extreme values of PM10 occur during periods of higher temperatures and lower air humidity.
In the metropolitan area of Brno we investigated the three monitoring stations Arboretum, Židenice and Zvonařka (see the map of stations in Brno, Figure 2). The site Arboretum is located in the botanical garden of the Mendel University of Agriculture and Forestry (part of the inner city area). Here we had access to data from three cold seasons (2006/11 to 2007/03, 2007/10 to 2008/03, 2008/10 to 2009/03; 483 observations and 26 missing values). The site Židenice is situated near a heavy-traffic street surrounded by warehouses, manufacturing plants and a railway line. Zvonařka station is a traffic spot within an industrial area. For the last two stations, data from two cold seasons were available (2007/11 to 2008/03, 2008/10 to 2009/03; 313 observations and 4 missing values).
Meteorological data together with their predictions (based on 6h-measurements) are provided by the Brno Regional Office of the Czech Hydrometeorological Institute. Note that these data are valid for the whole Brno agglomeration and therefore are not station specific.
The future development of dust aerosol pollution in Brno will apparently be characterized by a constant annual fluctuation with maxima at the end of winter. A gradual increase of monthly PM10 averages, caused by more frequent extreme traffic situations in combination with adverse dispersion conditions, is possible.
Similarly to Section 2.1, Tables 3 and 4 provide some core data characteristics at Arboretum and Židenice, respectively. Results for the station Zvonařka are analogous to those for the site Židenice.
The number of crucial days in the cold seasons, i.e. days on which PM10 exceeds the EU limit of 50 µg/m³, at the Brno stations is provided in Table 5. For the site Arboretum we observed between 2 (1 %) (2008/2009) and 29 (23 %) (2006/2007) exceedances per cold season. The corresponding values at Židenice and Zvonařka were remarkably higher.
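Counting crucial days of this kind is a simple tally over the daily means of one cold season. The sketch below is illustrative only; the function name `exceedance_summary` and the use of `None` for missing daily values are assumptions.

```python
def exceedance_summary(daily_means, limit=50.0):
    """Count days with a daily PM10 mean above the limit.

    daily_means: daily PM10 averages in µg/m³ for one cold season;
    None marks a missing value and is skipped (assumed convention).
    Returns (number of exceedances, number of valid days, percentage).
    """
    valid = [v for v in daily_means if v is not None]
    n_exceed = sum(1 for v in valid if v > limit)
    percentage = 100.0 * n_exceed / len(valid) if valid else None
    return n_exceed, len(valid), percentage
```

A day with a mean of exactly 50 µg/m³ is not counted, since the limit is exceeded only by values above it.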
The following covariates turned out to be important in forecasting daily averages $Ap_t$ of PM10 for day $t$ in the period from October 1 to March 31:
• $Ap_t$: average of 24 hourly measurements of PM10 on day $t$ (1:00 to 24:00).
The progression of the pollution at the stations is significantly influenced by the wind direction. The wind speed affects the pollution in two ways: higher pollution levels arise during stronger circulation (pollutant is brought in from distant sources) as well as during weaker circulation (the air is ventilated less and the polluting aerosol accumulates). Unlike in Hrdličková et al. (2008) and Veselý et al. (2009), here the predicted wind direction $D_t$ and wind speed $V_t$ were considered in the model. Note that wind direction is measured as an oriented angle between the vector pointing north of the station and the observed wind direction vector pointing to the station. Nevertheless, the same projections were included as in Hrdličková et al. (2008):
• $V_t \sin D_t$ (pr): average of 4 predictions of projections of the wind vector at 0:00, 6:00, 12:00, 18:00 on day $t$,
• $V_t \cos D_t$ (pr): average of 4 predictions of projections of the wind vector at 0:00, 6:00, 12:00, 18:00 on day $t$.
Note that this reparametrization provides an estimate of the angle $S$ [°] between the vector pointing north of the station and the vector pointing from the direction of maximum pollutant concentration to the station.
After estimating the parameters of the model, the corresponding component of the linear predictor can be reparametrized by the cosine addition formula to $\beta_S V_t \cos(D_t - S)$. Note that because of this reparametrization after the parameter estimation, the variables $V_t \sin D_t$ (pr) and $V_t \cos D_t$ (pr) were either both included or both excluded when searching for sufficient models in Sections 3.1 and 3.2. Furthermore, the following lag variables of PM10 and temperature were included:
• $Ap_{lag}$: average of hourly measurements of PM10 from 13:00 on day $t-2$ to 12:00 on day $t-1$,
• $T_{lag}$: average of 4 measurements of temperature at 18:00 on day $t-2$ and at 0:00, 6:00, 12:00 on day $t-1$.
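The reparametrization of the two wind projections follows from $\cos(D_t - S) = \cos D_t \cos S + \sin D_t \sin S$: matching coefficients gives $\beta_{\sin} = \beta_S \sin S$ and $\beta_{\cos} = \beta_S \cos S$. A minimal sketch of this step is given below; the function name `wind_reparam`, the choice $\beta_S \ge 0$ and the numeric coefficients in the usage note are assumptions for illustration, not values from the tables.

```python
import math


def wind_reparam(beta_sin, beta_cos):
    """Convert the coefficients of V*sin(D) and V*cos(D) into (beta_S, S).

    beta_sin * V*sin(D) + beta_cos * V*cos(D) == beta_S * V*cos(D - S),
    with beta_S >= 0 (assumed sign convention) and S in degrees in [0, 360).
    """
    beta_s = math.hypot(beta_sin, beta_cos)
    s_deg = math.degrees(math.atan2(beta_sin, beta_cos)) % 360.0
    return beta_s, s_deg
```

For example, the hypothetical coefficients `wind_reparam(0.3, -0.4)` yield an amplitude of 0.5 and an angle of roughly 143°. With $\beta_S \ge 0$, the wind contribution to the linear predictor is largest when $D_t = S$.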
In contrast to the stations in Graz, here the average temperature on day $t$ was also considered:
• $T_t$ (pr): average of 4 predictions of temperature at 0:00, 6:00, 12:00, 18:00 on day $t$.
Note that data on precipitation were not available for the Brno stations. Therefore, we used the predicted variable:
• $C_t$ (pr): average of 4 predictions of cloud cover at 0:00, 6:00, 12:00, 18:00 on day $t$.
Cloud cover is the degree to which the sky is covered by clouds and indirectly provides information on the duration of sunshine. It ranges from 0 for a completely clear sky to 10 for a completely overcast sky.
The effect of domestic heating was represented by the following variable:
• $HS_t$: activity of heating plants on day $t$; days in the same month have the same value (Oct: 0.08, Nov: 0.14, Dec: 0.17, Jan: 0.19, Feb: 0.16, Mar: 0.14).
For the same reasons as in Section 2.1, the following dummy variables indicating Weekend, Sun-/Holiday, February and March were included:
• weekend: 1 = Weekend, 0 = Working day on day $t$,
• sunday: 1 = Sunday/Holiday, 0 = no Sunday/Holiday on day $t$,
• febr: 1 = February, 0 = no February on day $t$,
• march: 1 = March, 0 = no March on day $t$.
Note that considering the variable weekend instead of saturday (as in the models for the Graz data) allows us to test the hypothesis that there is no difference between the PM10 level on Saturday and on Sun-/Holiday. When testing whether a model without weekend is sufficient, the variable sunday was also excluded.
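The calendar-based Brno covariates can be derived from the date alone. The following sketch is illustrative: the function name `calendar_covariates`, the `holidays` argument, and the decision to count a public holiday as part of weekend are assumptions, while the monthly heating values are those listed for $HS_t$.

```python
from datetime import date

# Monthly heating-plant activity as listed for HS_t (cold season only).
HEATING_ACTIVITY = {10: 0.08, 11: 0.14, 12: 0.17, 1: 0.19, 2: 0.16, 3: 0.14}


def calendar_covariates(day, holidays=frozenset()):
    """Return the dummy variables and HS_t for one day of the cold season.

    holidays: set of dates treated like Sundays (hypothetical calendar).
    Raises KeyError for months outside October to March.
    """
    is_saturday = day.weekday() == 5
    is_sunday_or_holiday = day.weekday() == 6 or day in holidays
    return {
        "weekend": int(is_saturday or is_sunday_or_holiday),  # assumption: holidays count
        "sunday": int(is_sunday_or_holiday),
        "febr": int(day.month == 2),
        "march": int(day.month == 3),
        "HS": HEATING_ACTIVITY[day.month],
    }
```

With this coding, a Saturday has weekend = 1 and sunday = 0, so the coefficient of sunday measures the additional difference between Sun-/Holidays and Saturdays, which is the hypothesis discussed above.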

Multiple linear regression
The regression models are designed for the dependent variable $\sqrt{Ap_t}$. We found the square root transformation to be a suitable choice to assure a constant error variance and to avoid a violation of the model assumptions.
Model assumption: Let $x_t^{(i)}$ ($i = 1, 2, \ldots, m$) be a pool of metric input variables and $d_t^{(j)}$ ($j = 1, \ldots, p$) be a pool of dummy (0/1) input variables at day $t$. Then we assume that the following linear model for $\sqrt{Ap_t}$ holds:

$$\sqrt{Ap_t} = \beta_0 + \sum_{i=1}^{m} \beta_i x_t^{(i)} + \sum_{j=1}^{p} \beta_{m+j} d_t^{(j)} + \varepsilon_t, \qquad (1)$$

where $\varepsilon_t$ denotes the error term. For each site the covariates and the corresponding parameter estimates $\hat\beta_k$, $k = 0, \ldots, m + p$, are chosen via a stepwise regression procedure from the candidate set of input variables (Graz, Subsection 2.1; Brno, Subsection 2.2). One remarkable feature of this simple linear model approach is that the parameters and results can still be interpreted conveniently. In contrast to other rather complex models with numerous input parameters and/or functional relationships generated by a black box mechanism, linear models of this type are still transparent for the user (see Hörmann et al., 2005; Stadlober et al., 2008).
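To make the role of the square root transformation concrete, the sketch below fits the model with a single metric covariate by ordinary least squares and squares the prediction to return to the PM10 scale. It is a toy reduction of the multi-covariate model; the function names `fit_sqrt_lm` and `predict_pm10` and the numbers in the usage note are assumptions, and no stepwise selection is performed.

```python
import math


def fit_sqrt_lm(x, y):
    """Least-squares fit of sqrt(y) = b0 + b1 * x for one covariate."""
    z = [math.sqrt(v) for v in y]          # response on the square-root scale
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (c - mean_z) for a, c in zip(x, z))
    b1 = sxz / sxx
    b0 = mean_z - b1 * mean_x
    return b0, b1


def predict_pm10(b0, b1, x_new):
    """Back-transform the predicted square root to the original scale."""
    return (b0 + b1 * x_new) ** 2
```

On the synthetic data `x = [0, 1, 2, 3]`, `y = [4, 9, 16, 25]` the fit recovers an intercept of 2 and a slope of 1, so the back-transformed forecast at `x_new = 4` is 36.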
In Graz-Mitte and Graz-Süd all variables of the specified pool were found to be significant. The corresponding results are given in Table 6 below, where the covariates are ordered according to their level of significance expressed by the t-value at Graz-Mitte. Accordingly, temperature inversion $\Delta T_t$ and the lag value $Ap_{lag}$ of PM10 are the most important ones, followed by three covariates of similar importance: average wind speed $V_t$, the lag value of temperature $T_{lag}$ and the sunday effect. However, at the sites in Brno only 4 to 8 variables of the predefined pool are important (see Table 7). The most significant ones for the less exposed test point Arboretum seem to be the lag value $Ap_{lag}$ of PM10 and the projections of the wind vector. On the other hand, at the test points Židenice and Zvonařka, which are near heavy-traffic streets, the projections of the wind vector, the lag value $T_{lag}$ of temperature, the cloud cover $C_t$ (pr) and the weekend effect exhibited a significant influence. We observed that the covariate sunday had no relevance at all. Hence it does not appear in Table 7.
Note that, unlike in Graz-Mitte and Graz-Süd, a negative march effect is observed in Brno. In Brno, the heating season ends in March. In this month, temperature inversions are shorter and less frequent, and therefore the thermal stratification, which tends to be more stable during winter, is destabilized. Under a temperature inversion the atmosphere is not ventilated, which implies an accumulation of pollutants and an increase of their concentration. The situation improves after a change in the weather conditions, when the circulation intensifies, which shows, among other things, in a sufficient dispersion of the pollutants or their wash-out by precipitation.

Generalized linear models with gamma distributions
Another possibility to model the response $Ap_t$ is to assume that the measurements of $Ap_t$ are gamma distributed. Then one can apply the generalized linear model approach with log-link function (see e.g. Fahrmeir and Tutz, 1994) as used in our previous study (Hrdličková et al., 2008).
Model assumption: Let $x_t^{(i)}$ ($i = 1, 2, \ldots, m$) be a pool of metric input variables and $d_t^{(j)}$ ($j = 1, \ldots, p$) be a pool of dummy (0/1) input variables at day $t$. Then we assume that $Ap_t$ has a $\mathrm{Gamma}(\mu_t, 1/\phi)$ distribution and that $\log(\mu_t) = \log\left(E(Ap_t \mid X_t)\right)$ can be described by a linear model of the form

$$\log(\mu_t) = \beta_0 + \sum_{i=1}^{m} \beta_i x_t^{(i)} + \sum_{j=1}^{p} \beta_{m+j} d_t^{(j)}, \qquad (2)$$

where $\phi$ is the dispersion parameter and $E(Ap_t \mid X_t)$ denotes the conditional expectation of $Ap_t$ for the given set $X_t$ of variables on the right-hand side of model (2). The dispersion parameter $\phi$ can be estimated by the generalized Pearson statistic, and the parameter estimates $\hat\beta_k$, $k = 0, \ldots, m + p$, are obtained by the maximum likelihood method; both are implemented in the procedure glmfit, which is part of the Statistics Toolbox of the MATLAB package. We selected the best model by stepwise backward selection using the Wald statistic as selection criterion, see Fahrmeir and Tutz (1994). Of course, standard residual analyses and goodness-of-fit tests were used to check the adequacy of these models.
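The estimation step can be illustrated by iteratively reweighted least squares (IRLS), the standard way of computing maximum likelihood estimates in a GLM. The sketch below is not the glmfit routine; it is a toy version for a single covariate, and the function name `fit_gamma_loglink`, the fixed iteration count and the synthetic data in the usage note are assumptions. For a gamma response with log-link the IRLS weights are constant, so each iteration is an ordinary least-squares fit to the working response.

```python
import math


def fit_gamma_loglink(x, y, n_iter=25):
    """IRLS for log(mu) = b0 + b1 * x with a gamma-distributed response.

    Returns (b0, b1, phi), where phi is the Pearson-type estimate of the
    dispersion parameter with n - 2 degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    eta = [math.log(v) for v in y]        # start at the observed values
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response z = eta + (y - mu) / mu; all IRLS weights equal 1
        # because Var(mu) = mu^2 and d(mu)/d(eta) = mu under the log-link.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        mean_z = sum(z) / n
        b1 = sum((a - mean_x) * (c - mean_z) for a, c in zip(x, z)) / sxx
        b0 = mean_z - b1 * mean_x
        eta = [b0 + b1 * a for a in x]
    mu = [math.exp(e) for e in eta]
    phi = sum(((yi - m) / m) ** 2 for yi, m in zip(y, mu)) / (n - 2)
    return b0, b1, phi
```

On noise-free synthetic data generated as `y = exp(1 + 0.5 * x)`, the routine returns coefficients close to 1 and 0.5 and a dispersion estimate close to zero.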
In Graz-Mitte and Graz-Süd all variables of the specified pool were found to be significant. The corresponding results are given in Table 8 below, where the covariates are ordered according to their level of significance at Graz-Mitte. Note that we obtain essentially the same order as in the multiple linear regression models of Table 6 in Subsection 3.1.
For the sites in Brno a specific combination of covariates appears. The model for Arboretum contains only 6 covariates, but the two traffic spots Židenice and Zvonařka need 8 identical covariates for modeling PM10 (see Table 9). The two projections $V_t \sin D_t$ (pr) and $V_t \cos D_t$ (pr) of the wind vector and the dummy variable march are used in all three models, whereas the lag variable $Ap_{lag}$ is chosen for modeling PM10 at Arboretum, and the lag variable $T_{lag}$ is required in the models of PM10 at Židenice and Zvonařka. Here, the predicted temperature $T_t$ (pr) and cloud cover $C_t$ (pr) together with the weekend and february effects also exhibited their influence. On the other hand, we observed once more that the covariate sunday had no relevance at all. Hence, it does not appear in Table 9.

The multiple linear regression models and the generalized linear models introduced in Section 3 are used as a basis for forecasting daily means of PM10 in Graz (November 3, 2009 to March 31, 2010) and in Brno (October 1, 2008 to March 31, 2009). As a first step, we used a historical training data set to compute the estimates $\hat\beta_k$, $k = 0, \ldots, m + p$, for the corresponding models listed in Tables 6, 7, 8, and 9, which yield predictions of $\sqrt{Ap_t}$ and $\log(Ap_t)$, respectively. To obtain the forecast of $Ap_t$ one has to square the predicted $\sqrt{Ap_t}$ or exponentiate the predicted $\log(Ap_t)$. The resulting biases are negligible for our purposes (and thus will not be considered). See Cordeiro and McCullagh (1991) for a discussion of bias correction in GLM.
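The origin of the back-transformation bias mentioned above can be seen from a short derivation. It is only a sketch: $\eta_t$ stands for the linear predictor, and the additive error with mean zero and variance $\sigma^2$ on the square-root scale is notation assumed here for illustration.

```latex
% LM on the square-root scale: \sqrt{Ap_t} = \eta_t + \varepsilon_t,
% with E(\varepsilon_t) = 0 and Var(\varepsilon_t) = \sigma^2 (assumed notation).
\begin{align*}
E(Ap_t \mid X_t)
  &= E\bigl[(\eta_t + \varepsilon_t)^2\bigr]
   = \eta_t^2 + 2\,\eta_t\,E(\varepsilon_t) + E(\varepsilon_t^2)
   = \eta_t^2 + \sigma^2 .
\end{align*}
% Squaring the predicted square root targets \eta_t^2 and therefore falls
% short of the conditional mean by about \sigma^2.
%
% GLM with log-link: the model specifies \log E(Ap_t \mid X_t) = \eta_t, so
\begin{align*}
E(Ap_t \mid X_t) &= \exp(\eta_t),
\end{align*}
% and a bias of \exp(\hat\eta_t) can only come from the estimation error
% in \hat\eta_t.
```

On the original scale the shortfall of the squared LM forecast is of the order of the residual variance of the square-root model, which is small relative to PM10 levels near the limit value when the model fits well.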
Clearly, in operational mode the covariates that refer to the prediction day $t$ are available as meteorological forecasts only. In Graz, this concerns the measure for temperature inversion $\Delta T_t$, the average wind speed $V_t$ and the precipitation as (0/1) categorized variable $Prec_t$. The models for the stations in Brno need more sophisticated meteorological forecasts: averages of 4 predictions of (i) the wind speed $V_t$ (pr), (ii) the wind direction $D_t$ (pr), (iii) the temperature $T_t$ (pr) and (iv) the cloud cover $C_t$ (pr).
The quality of the PM10 forecasts is measured and compared by a quality function introduced in Stadlober et al. (2008), which is tailored to the particular needs given by the EU limit value of 50 µg/m³.
The quality function $Q(x, y)$ assigns a value between 0 and 1 to each pair ($x$ = observation, $y$ = forecast). Values close to 1 signify very good forecasts, whereas small values near zero indicate bad forecasts. Based on the value of $Q(x, y)$ we define a scoring system, where $A = \{x \le 50,\ y \le 50\}$, $B = \{x \ge 100,\ y \ge 100\}$, and $I_E$ is the indicator function of the set $E$. The function $Q(x, y)$ has jumps near the threshold points ($x = 50$, $y = 50$) and ($x = 100$, $y = 100$); it penalizes combinations such as $Q(x > 50, y < 50)$ and $Q(x > 100, y < 100)$, i.e. cases in which the forecast $y$ is below 50 µg/m³ (below 100 µg/m³) but the observation $x$ is above 50 µg/m³ (above 100 µg/m³), or vice versa. Therefore, a situation in which the observation is above a threshold and the forecast is below it is assessed in the same way as a situation in which the observation is below a threshold and the forecast is above it, though the second situation might not be as crucial in reality as the first one. However, the proposed $Q$ function has been tested for six winter seasons (2004/2005 to 2009/2010) in Graz and has proven to be a suitable measure for the intended application (Stadlober et al., 2008).
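The sets $A$ and $B$ and the notion of a threshold mismatch can be written down directly. The sketch below does not implement $Q$ itself, whose formula is given in Stadlober et al. (2008); it only shows the indicator functions and a symmetric mismatch check, and the function names are assumptions.

```python
def indicator_a(x, y):
    """I_A: 1 if observation x and forecast y are both at most 50 µg/m³."""
    return int(x <= 50 and y <= 50)


def indicator_b(x, y):
    """I_B: 1 if observation x and forecast y are both at least 100 µg/m³."""
    return int(x >= 100 and y >= 100)


def threshold_mismatch(x, y, limit):
    """True if observation and forecast lie on opposite sides of the limit.

    The check is symmetric: an observed exceedance that was not forecast and
    a forecast exceedance that did not occur are treated alike.
    """
    return (x > limit) != (y > limit)
```

For instance, an observation of 62 µg/m³ with a forecast of 45 µg/m³ is a mismatch at the 50 µg/m³ threshold, and so is an observation of 45 µg/m³ with a forecast of 62 µg/m³.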

Practical performance and comparison
In this subsection, we compare the results of the two modelling approaches with respect to the quality function $Q(x, y)$ in (3). We start with the linear models and generalized linear models of Graz-Mitte and Graz-Süd fitted on training data of seven cold seasons (2002/10 to 2009/03, Graz-Mitte) and six cold seasons (2003/10 to 2009/03, Graz-Süd); see Tables 6 and 8. They deliver daily forecasts of $Ap_t$ for test data from November 3, 2009 to March 31, 2010.
Tables 10 and 11 show a good agreement in the quality of GLM and LM forecasts for both sites in Graz. A high correlation of the forecasts is exhibited in Figures 3 and 4 (top right panel). In Graz-Mitte it turns out that there are 70.5 % (69.1 %) very good or good forecasts. In Graz-Süd this quality is reached by both models in 66.4 % of the cases. At least satisfactory forecasts for both models can be expected with a probability of 85.2 % (Graz-Mitte) or 83.9 % (Graz-Süd).
Splitting the observations into two classes, PM10(low): PM10 ≤ 50 and PM10(high): PM10 > 50, we get at least 89 % satisfactory forecasts for PM10(low) and at least 76 % for PM10(high) at Graz-Mitte and Graz-Süd. In Brno, the LM and GLM models are fitted to training data from the cold seasons up to March 2008 (Tables 7 and 9) and they yield daily forecasts of $Ap_t$ for test data from October 1, 2008 to March 31, 2009. Table 12 for station Arboretum presents successful predictions of both the GLM and LM models, probably caused by the low number of PM10 observations above 50 µg/m³, as can be seen in Figure 5.
Table 13 (station Židenice) shows that both models reached a similar percentage of acceptable forecasts (GLM: 73.4 %, LM: 72.9 %), though the individual qualities of the predictions are quite different, as can be observed in Figure 6, bottom left panel. The top right panel of Figure 6 reveals that the GLM model tends to predict higher values than the LM model.
On the one hand, Table 14 for station Zvonařka reveals a high percentage of very good forecasts (GLM: 40.7 %, LM: 40.1 %), on the other hand a high fraction of bad and very bad forecasts (GLM: 28.8 %, LM: 27.1 %). GLM and LM forecasts agree for PM10(low) (70.8 % (GLM) and 74.0 % (LM) are at least satisfactory forecasts). However, they rather disagree for PM10(high) (76.7 % (GLM) and 71.2 % (LM) are at least satisfactory), with higher values obtained by the GLM model (see Figure 7, top right panel).
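The percentages of at least satisfactory forecasts in the classes PM10(low) and PM10(high) can be computed from the observed values and the assigned quality categories. The sketch below is illustrative: the function name `satisfactory_share` and the coding of the categories as 1 (very good) to 5 (very bad), with 3 meaning satisfactory, are assumptions about the ordering of the five categories.

```python
def satisfactory_share(pairs, limit=50.0):
    """Percentage of forecasts in categories 1 to 3, split by PM10 class.

    pairs: iterable of (observed PM10 in µg/m³, quality category 1..5).
    Returns a dict with keys 'low' (observation <= limit) and 'high'
    (observation > limit); a class without observations maps to None.
    """
    counts = {"low": [0, 0], "high": [0, 0]}   # [at least satisfactory, total]
    for observed, category in pairs:
        key = "low" if observed <= limit else "high"
        counts[key][1] += 1
        if category <= 3:
            counts[key][0] += 1
    return {
        key: (100.0 * ok / total if total else None)
        for key, (ok, total) in counts.items()
    }
```

Applied once to the GLM categories and once to the LM categories of a station, this gives the two percentages per class that are compared in the text.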

Conclusion and Recommendations
In the present paper, we have shown that PM10 forecasting models for Brno and Graz based on (i) multiple linear regression with square root transformation and (ii) generalized linear modelling with gamma distribution and log-link are suitable choices for the intended goal. They yield similar and comparable results with respect to our quality measure. Due to the simple and explicit description of the model parameters, we are confident that non-transparent black box approaches are not preferable in this case. The covariates are selected in order to represent the meteorological and anthropogenic situation of the locations. In general, we attach importance to the practical performance and the convenient handling of the models. For everyday application in the field it is necessary that both the measured covariates and the covariates obtained as meteorological forecast values are available at a certain time of the previous day $t-1$. Both implementation and servicing have to be convenient for users.
With respect to our quality function, 71 % (traffic spots Židenice and Zvonařka in Brno), 85 % (Graz-Mitte: near a traffic spot, Graz-Süd: near an industrial area) and 98 % (Arboretum: botanical garden) of the forecasts are assigned to the categories very good, good or satisfactory. This performance is very promising, but the models are open to further adaptations. In general, a clever choice of additional covariates may improve the reliability of the models. On the one hand, our models may serve as regular monitoring tools; on the other hand, they may also be established as a feasible and objective basis for decision making with regard to traffic regulation measures.
PM10 levels observed within different meteorological scenarios for the test point Graz-Mitte, where O-D covers the period October 1 to December 31 and J-M the period January 1 to March 31 (8 winter seasons 2002/2003 to 2009/2010, 1348 observations, 55 missing values). Medians of covariates are calculated separately for the periods O-D and J-M. Here 'low'

Figure 1: Map of stations in Graz.

Figure 2: Map of stations in Brno.
Models for Arboretum are fitted on data from two cold seasons (2006/10 to 2008/03, 310 observations and 15 missing values). For Židenice and Zvonařka the data from the season 2007/11 to 2008/03 (132 observations) were available. All models fitted on the training data sets had access to predicted covariates (provided by the Czech Hydrometeorological Institute), which differs from the modeling of the Graz data, where observed values of the covariates were used. These models were applied for predictions on the test data set, which includes the days of the period 2008/10 to 2009/03.

Figure 3: Graz-Mitte: scatterplot of forecasts of LM versus observations scored by $Q$ (top left), forecasts of $Ap_t$ of GLM versus LM (top right), $Q$ of GLM versus LM (bottom left), scatterplot of forecasts of GLM versus observations scored by $Q$ (bottom right).

Figure 4: Graz-Süd: scatterplot of forecasts of LM versus observations scored by $Q$ (top left), forecasts of $Ap_t$ of GLM versus LM (top right), $Q$ of GLM versus LM (bottom left), scatterplot of forecasts of GLM versus observations scored by $Q$ (bottom right).

Figure 5: Arboretum: scatterplot of forecasts of LM versus observations scored by $Q$ (top left), forecasts of $Ap_t$ of GLM versus LM (top right), $Q$ of GLM versus LM (bottom left), scatterplot of forecasts of GLM versus observations scored by $Q$ (bottom right).

Figure 6: Židenice: scatterplot of forecasts of LM versus observations scored by $Q$ (top left), forecasts of $Ap_t$ of GLM versus LM (top right), $Q$ of GLM versus LM (bottom left), scatterplot of forecasts of GLM versus observations scored by $Q$ (bottom right).

Figure 7: Zvonařka: scatterplot of forecasts of LM versus observations scored by $Q$ (top left), forecasts of $Ap_t$ of GLM versus LM (top right), $Q$ of GLM versus LM (bottom left), scatterplot of forecasts of GLM versus observations scored by $Q$ (bottom right).

Table 1: Percentage of

Table 2: Number of observations of PM10 above 50 µg/m³ and total number of observations (in brackets) per month and season.

Table 3: The analogue of Table 1 for the test point Arboretum.

Table 4: The analogue of Table 1 for the test point Židenice.

Table 5: Number of observations of PM10 above 50 µg/m³ and total number of observations (in brackets) per month and season.

Table 6: Parameter estimates $\hat\beta$ of the linear model for $\sqrt{Ap_t}$ with their standard errors and t statistics for the Graz stations. The table is completed by the coefficient of determination $R^2$, the standard deviation $s_e$ of the model and the number of observations $n$.

Table 7: Parameter estimates $\hat\beta$ of the sufficient model for $\sqrt{Ap_t}$ with their standard errors and t statistics for the Brno stations. The table is completed by the identified angle of highest air pollution $S$ and the corresponding regression coefficient $\hat\beta_S$. Then the coefficient of determination $R^2$, the standard deviation $s_e$ of the model and the number of observations $n$ follow.

Table 8: Parameter estimates $\hat\beta$ of the sufficient GLM with log-link for $E(Ap_t \mid X_t)$ with their standard errors for the Graz stations. Then the goodness-of-fit statistics $D$ and $\chi^2_{ans}$ with their degrees of freedom $f$ and the number of observations $n$ follow.

Table 9: Parameter estimates $\hat\beta$ of the sufficient GLM with log-link for $E(Ap_t \mid X_t)$ and their standard errors for the Brno stations. The table is completed by the identified angle of highest air pollution $S$ for each station and the corresponding regression coefficient $\hat\beta_S$. Then the goodness-of-fit statistics $\chi^2_{ans}$ and Wald statistics $W$ together with their degrees of freedom $f$, the number of observations $n$ and the corresponding $\chi^2_{0.95}$ quantile follow.

Table 10: Contingency table for Graz-Mitte: Quality of prediction GLM (rows) versus LM (columns), number and percentage of categories of quality 1 to 5.

Table 11: Contingency table for Graz-Süd: Quality of prediction GLM (rows) versus LM (columns), number and percentage of categories of quality 1 to 5.

Table 12: Contingency table for Arboretum: Quality of prediction GLM (rows) versus LM (columns), number and percentage of categories of quality 1 to 5.

Table 13: Contingency table for Židenice: Quality of prediction GLM (rows) versus LM (columns), number and percentage of categories of quality 1 to 5.

Table 14: Contingency table for Zvonařka: Quality of prediction GLM (rows) versus LM (columns), number and percentage of categories of quality 1 to 5.