Many applications, from weather forecasts to agro-advisories to disaster management, require meteorological forecasts at various scales. In spite of their many limitations, dynamical models have emerged as the most versatile tools for generating forecasts at different spatial and temporal scales (Bjerknes 1969; Holton 1992; Fowler 1997; Wallace and Hobbs 2006). Although significant progress has been made in model physics as well as numerics in recent years, the skill of dynamical models is still limited in many respects (Rodwell and Palmer 2007). Identifying the discrepancies and abilities of a model is crucial when selecting dynamical models for weather and climate prediction and for further improvements in model physics and configurations (Kalnay 2003; Palmer and Hagedorn 2006; Swanson and Roebber 2008; Nicolis et al. 2009). Characterizing the forecast errors and their sensitivity to the choice of physical parameterizations provides greater confidence in the model outputs and qualifies the model for predicting real-time events (Pielke 2002). This can be achieved by rigorous validation and sensitivity studies of the simulations carried out by these models for various synoptic conditions and initialization datasets (Kalnay 2003; Frehlich and Kelley 2008; Frehlich 2011).
A major difficulty in addressing model systematic error (bias) is that it may originate from a number of sources, such as model numerics, model physics and forecast methodology (Mass et al. 2002, 2003; Wu 2005; Dee 2005; Monache et al. 2006; Gel 2007; Judd et al. 2008; Houtekamer et al. 2009; Bao et al. 2010; Goswami et al. 2011). As the bias due to numerics and forecast methodology may primarily arise from the projection of model data on a given horizontal and vertical grid to point observations, this part of the bias may be expected to be somewhat systematic in nature (Steed and Mass 2004; Abramowitz et al. 2007; Mass et al. 2008). In particular, this bias primarily arises from the adopted grid and the methodology for interpolation. Identification of this bias for a given model and its removal through an objective method can significantly improve forecast skill (Stewart and Reagan-Cirincione 1991; Schubert and Chang 1996; Mass 2003). However, the efficiency of the removal of such bias depends strongly on the algorithm used. Further, while an algorithm may be universal in its potential application, it needs to be calibrated and validated for specific regions. The present thesis describes an objective bias correction applied to a number of scales.
2.2 Objective and Scope of work
The thesis describes development of new algorithms for objective bias correction of model simulations for achieving enhanced and significant skill in advance forecasting. While the focus is on the Indian region, we consider a number of spatial and temporal scales, from station scale to tropical cyclone, and from daily to monthly forecasts. Similarly, we consider more than one dynamical model to evaluate our method. The evaluation and validation encompass both comparison against actual observations and relative performance with respect to other methods as well as null hypotheses. Chapter-wise outline of the thesis is provided below.
This chapter begins with a brief description of Numerical Weather Prediction (NWP), the different types of NWP model error and the sources of error in dynamical forecasts. The major contributions of earlier researchers on different bias correction methods are briefly described. This chapter also presents the basic principle of the objective non-linear bias correction of model simulations with respect to observations. Finally, the chapter describes the methodology for evaluation and validation.
Chapter-2 Objective Bias correction with GCM Forecasts: Tropical Cyclone
Chapter 2 gives a brief description of tropical cyclones over the north Indian Ocean basins, the events and data used, and the different objective bias correction algorithms used to improve the forecast skill of intensity, track and cyclogenesis. The intensity and location forecasts are generated using an optimized configuration of a variable resolution Global Circulation Model (VR-GCM) that combines the advantages of a limited area model and a global circulation model (version LMDZ3 developed at LMD, France). This chapter also presents results showing that objective non-linear debiasing generates intensity and location forecasts with enhanced reliability from raw forecasts.
Chapter-3 Objective Bias correction: Station-Scale Forecasts from LAM
While meso-scale models today can support horizontal grid spacing down to a few kilometers or less, downscaling of model forecasts to arrive at station-scale values will remain a necessary step for many applications. This chapter gives the details of the bias correction algorithm developed in this research for station-scale forecasts of 2-m temperature using the Pennsylvania State University/National Center for Atmospheric Research (PSU/NCAR) meso-scale model version 3 (MM5) over the Indian region. For station-scale forecasts, it is shown that the skill of the debiased forecasts is significant against that of the raw forecasts. This chapter also compares the objective non-linear debiased forecasts with other methods of station-scale forecasting.
Chapter-4 Objective Bias Correction: Meso-scale Forecasts.
Meso-scale forecasts are important for many applications, especially for high-impact weather over urban locations; the complexity and variability of the flow due to the structures and anthropogenic activities of a mega city pose new challenges to meso-scale forecasting. This chapter describes the source of the observations, the structure of the meso-scale observations and the bias correction algorithm for 2-m temperature for different sessions. In this study we consider four stations located in and around Delhi, separated by aerial distances of a few kilometers. It is shown that for all four locations the skill of the debiased forecast is significant against the skill of the raw forecasts and significantly higher than that of a 7-day running mean error removal (null hypothesis).
Chapter-5 Objective Bias correction: Station-Scale Analysis from Reanalysis
Reanalysis data, like those from the National Centers for Environmental Prediction (NCEP), provide a versatile, and sometimes the only, long-term, multi-variable dataset at a location. However, reanalysis data are known to contain significant biases that depend on geographical location. This chapter describes the impact of the bias correction algorithm applied to the NCEP Reanalysis for multiple locations over India. It is shown that objective bias correction can effectively improve the quality of such analysis at station scale.
Chapter-6: Conclusions and Future Directions.
This chapter includes a summary of the entire study and the major findings. It also discusses some of the limitations and potential improvements of the algorithm, as well as applications in other areas. Finally, the chapter points towards future directions of the work. One of the points discussed is the possible extension of the algorithm to forecasts of extreme rainfall events.
3. Objective Non-linear Debiasing:
A major difficulty in addressing model bias is that it may originate from a number of sources, such as model numerics, model physics and forecast methodology (Wu et al. 2005; Monache et al. 2006; Gel 2007; Judd et al. 2008; Houtekamer et al. 2009). As the bias due to numerics and forecast methodology may primarily arise from the projection of model data on a given horizontal and vertical grid to point observations, this part of the bias may be expected to be somewhat systematic in nature (Cui et al. 2008). In particular, this bias primarily arises from the adopted grid and the methodology for interpolation. Identification of this bias for a given model and its removal through an objective method can significantly improve forecast skill (Stewart and Reagan-Cirincione 1991; Schubert and Chang 1996; Mass 2003). However, the efficiency of the removal of such bias depends strongly on the algorithm used. Further, while an algorithm may be universal in its potential application, it needs to be calibrated and validated for specific regions.
Removal of the systematic parts of the error (or bias) through an objective bias correction (referred to as debiasing) procedure can significantly improve forecast skill (Stewart and Reagan-Cirincione 1991). The bias in the raw forecasts was found to depend on the value of the forecast. The generic methodology of objective non-linear debiasing adopted in this thesis is to determine the debiasing parameters from a training set and then apply the method with these parameters to a test set.
We consider a debiasing algorithm of the form
$$X_D = X_R\,[\,1 + \alpha - \beta X_R\,] \qquad (1)$$
where $X_D$ is the debiased forecast and $X_R$ is the raw forecast. The optimum values of $\alpha$ and $\beta$ were obtained through a search procedure that minimizes $|X_F - X_O|$, where $X_F$ and $X_O$ represent, respectively, the forecast and the observed values. In this procedure a range of values of $\alpha$ and $\beta$ is considered, with sufficiently small intervals, to arrive at the optimum values characterized by the lowest $|X_F - X_O|$ for the training sample.
Two different types of objective non-linear debiasing have been considered in this thesis. First, optimum values of the debiasing parameters were obtained using the total number of forecasts. As this procedure (referred to as potential debiasing) uses in-sample data, the skill assessed is not strictly realizable, and we shall refer to it as potential skill. The skill with potential debiasing is essentially indicative of the maximum skill attainable with the procedure, or of the enhancement in skill likely if larger training samples were available. The optimum values of the debiasing parameters are obtained by a search procedure for the minimum average error over the training set. The search algorithm begins with initial values of the debiasing parameters (generally large negative values) and allows small increments at each step until the final values (generally large positive values) are reached; the pair with the minimum error is adopted. Second, for a more realistic assessment of (realizable) skill through non-linear debiasing, we applied the debiasing algorithm to a training set to determine the debiasing parameters; these parameters were then applied outside the training set, for cross-validation (Wilks 2006; Hamill et al. 2008). Training sets of different sizes have been considered for the different bias correction studies, such as station-scale forecasts, tropical cyclone intensity and location forecasts, and meso-scale forecasts. The details of the bias correction algorithm and methodology are discussed separately in later chapters.
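The search and the two kinds of skill described above can be illustrated with a minimal Python sketch. It is not the thesis code: the synthetic data, the search ranges and step sizes, the 16-case training set and all function names are assumptions made for illustration. Only the debiasing form of equation (1) and the idea of an exhaustive search for the minimum average absolute error come from the text.

```python
from itertools import product


def debias(x_raw, alpha, beta):
    """Equation (1): X_D = X_R * (1 + alpha - beta * X_R)."""
    return x_raw * (1.0 + alpha - beta * x_raw)


def mean_abs_error(forecasts, observations):
    """Average of |forecast - observation| over a sample."""
    return sum(abs(f - o) for f, o in zip(forecasts, observations)) / len(forecasts)


def fit_parameters(raw, obs, alphas, betas):
    """Exhaustive search over the (alpha, beta) grid; returns the pair with
    the lowest mean absolute error on the sample, plus that error."""
    best = None
    for alpha, beta in product(alphas, betas):
        err = mean_abs_error([debias(x, alpha, beta) for x in raw], obs)
        if best is None or err < best[2]:
            best = (alpha, beta, err)
    return best


# Synthetic data (made up for this sketch): raw forecasts whose bias depends
# on the forecast value, plus a small alternating perturbation.
raw = [8.0 + 1.5 * ((7 * i) % 24) for i in range(24)]
obs = [debias(x, 0.2, 0.01) + 0.3 * (-1) ** i for i, x in enumerate(raw)]

# Hypothetical search ranges and increments.
alphas = [-1.0 + 0.05 * k for k in range(41)]    # -1.0 ... 1.0
betas = [-0.05 + 0.005 * k for k in range(21)]   # -0.05 ... 0.05

# Potential debiasing: parameters fitted on, and scored against, all cases.
alpha_p, beta_p, potential_err = fit_parameters(raw, obs, alphas, betas)

# Realizable debiasing: fit on a training set, score only outside it.
n_train = 16  # assumed training-set size
alpha_r, beta_r, train_err = fit_parameters(raw[:n_train], obs[:n_train], alphas, betas)
test_raw, test_obs = raw[n_train:], obs[n_train:]
realizable_err = mean_abs_error([debias(x, alpha_r, beta_r) for x in test_raw], test_obs)
raw_err = mean_abs_error(test_raw, test_obs)

print(f"potential:  alpha={alpha_p:.2f} beta={beta_p:.3f} error={potential_err:.2f}")
print(f"realizable: alpha={alpha_r:.2f} beta={beta_r:.3f} error={realizable_err:.2f}")
print(f"raw error on the test cases: {raw_err:.2f}")
```

The exhaustive grid keeps the procedure objective and reproducible, at a cost that grows with the number of grid points; a finer increment or a wider range only changes the two lists `alphas` and `betas`.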
4. Tropical Cyclone Forecasts
With increasing socio-economic activities in coastal areas the world over, the damage potential of tropical cyclones has considerably increased. While there has been significant progress in the modeling and forecasting of cyclones over the past decades, there is also growing expectation of, and demand for, longer range and higher accuracy (Anderson 1996; AMS Council 2000). The damage potential of a tropical cyclone is primarily determined by the strength (intensity) of the wind, through either direct damage or the induced storm surge. In particular, the peak intensity of a storm is a measure of its maximum damage potential. Thus an accurate and advance prediction of the peak intensity is a critical input for efficient and pro-active disaster management. One factor that limits the skill of numerical models is the bias in the model forecasts with respect to observations. This study shows that a non-linear objective debiasing can generate intensity and location forecasts with enhanced reliability from raw forecasts. The forecasts for thirty-six cases of storms and cyclones over the north Indian Ocean during 1990-2009 were generated using an optimized configuration of a variable resolution Global Circulation Model (VR-GCM) that combines the advantages of a limited area model and a global model; the version used is LMDZ3, developed at LMD, France (Sadourny and Laval 1984; Sharma and Sadourny 1987; Sabre et al. 2000; Goswami and Gouda 2009, 2010). The hindcasts were carried out in a completely operational setting, that is, without assuming any observed information beyond the day of the initial condition.
Table 1. Analysis of Skill Score, Average Absolute Error and Average Bias for raw, non-linear potential, non-linear realizable and climatological forecasts.
[Table body not recovered. Surviving column headers: Average Absolute Error (m s⁻¹); Bias (m s⁻¹).]
A scrutiny of the skill scores, average absolute error (m s⁻¹) and average bias for the different types of tropical cyclone intensity forecasts (Table 1) shows that while the raw and climatological forecasts have very low or zero skill scores, the non-linearly debiased forecasts have appreciable skill scores. In terms of average absolute error, the potential and realizable non-linearly debiased forecasts have errors of only 5.1 m s⁻¹ and 6.3 m s⁻¹, respectively, compared with 11.6 m s⁻¹ and 12.9 m s⁻¹ for the raw and climatological forecasts.
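The size of the improvement quoted above can be made explicit with a short worked calculation. The sketch below uses only the four average absolute errors stated in the text; the fractional error reduction it computes is an illustrative measure chosen here, not necessarily the skill score reported in Table 1, whose definition is not given in this section.

```python
# Average absolute errors (m/s) quoted in the text for the intensity forecasts.
mae = {"raw": 11.6, "climatology": 12.9, "potential": 5.1, "realizable": 6.3}


def error_reduction(error, reference_error):
    """Fractional reduction in error relative to a reference forecast
    (an assumed illustrative measure, not the thesis's skill score)."""
    return 1.0 - error / reference_error


reduction = {
    method: {ref: error_reduction(mae[method], mae[ref]) for ref in ("raw", "climatology")}
    for method in ("potential", "realizable")
}

for method, by_ref in reduction.items():
    for ref, value in by_ref.items():
        print(f"{method} vs {ref}: {100 * value:.1f}% lower average absolute error")
```

On these figures the realizable debiased forecast roughly halves the average absolute error of both reference forecasts.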
We have combined objective debiasing with VR-GCM forecasts to improve the skill of intensity and location forecasts of cyclones over the north Indian Ocean. Objective non-linear debiasing not only improves forecast skill above that of the raw forecast, but the resulting skill score is appreciable in general. The consistent performance of the model for different conditions makes it an attractive tool for tropical cyclone forecasting. In an actual application one could generate multiple forecasts, each for an ocean basin, with a grid that has been critically evaluated for the basin.
While there are few studies on intensity and track forecasts (Kotal et al. 2009; Roy Bhowmik and Durai 2010) for cyclones over the north Indian Ocean, there have been several methods and studies on intensity and track forecasts over the Atlantic and Pacific (DeMaria and Kaplan 1999; Knaff et al. 2003, 2005; Blackerby 2005; DeMaria et al. 2007). The National Hurricane Center (USA) has documented the skill in forecasting track and intensity for cyclones over the Pacific and the Atlantic from a number of models for leads of 12-72 hours (NHC Report 2008).
Figure 1 A comparison, in terms of average absolute error (km), of the debiased forecasts (DFR) with the standard error from various operational agencies: European Centre for Medium-Range Weather Forecasts (ECMWF), Global Forecast System (GFS), Japan Meteorological Agency (JMA) and India Meteorological Department (IMD), at different forecast hours. The numbers in the panel represent the average forecast error (km) up to 72 hours for the corresponding case (Kotal et al. 2009; Roy Bhowmik and Durai 2010).
Figure 1 compares track forecasts, in terms of average absolute error (km), from the debiased forecasts (DFR) with the standard error from various operational agencies, namely the European Centre for Medium-Range Weather Forecasts (ECMWF), the Global Forecast System (GFS), the Japan Meteorological Agency (JMA) and the India Meteorological Department (IMD), at different forecast hours. For TC track prediction, IMD operationally runs three regional models for short-range prediction: the Limited Area Model (LAM), the fifth-generation Pennsylvania State University-National Center for Atmospheric Research Mesoscale Model (MM5) and the Quasi-Lagrangian Model (QLM). For the 12-hour forecast, the debiased forecasts are less skillful than the ECMWF, GFS and IMD-QLM forecasts, but have greater skill than those of JMA and IMD-MM5 (Kotal et al. 2009; Roy Bhowmik and Durai 2010). The 24-hour DFR forecasts are superior to all the others except ECMWF, for which the skill is comparable. At other forecast hours, the DFR forecasts generally have better skill. In particular, the average error over the 12 to 72 hour forecasts is the lowest for DFR. It is also interesting to note that the growth of error for DFR is the lowest and is essentially constant, as against the rapid growth of error for the other forecasts.
Table 2. A comparison of the performance of non-linear debiasing, in terms of potential skill (DFP) and realizable skill (DFR), with current intensity forecasts by different methods.
[Table body not recovered. Surviving column headers: Percentage (%) of cases with UP - UO < -5 m s⁻¹; Percentage (%) of cases with UP - UO > 5 m s⁻¹.]
A comparison of skill, in terms of under-warning and over-warning, of the non-linear debiasing with six other methods including SHIPS, DSHIPS and NHC (Blackerby 2005; Boothe et al. 2006) shows the skill of the present method to be comparable in some cases but not superior; in general, a distinct pattern does not emerge (Table 2). It should be emphasized, however, that a comparison of skill over the north Indian Ocean with that over the Atlantic or the Pacific is not strictly valid, as the tropical cyclones over these basins have quite different characteristics. Thus a more robust comparative evaluation of skill would require other methods of intensity forecasting to be applied to the north Indian Ocean basin. Further, the lead of the forecast in the present case is often longer than the 72 hours used in calculating skill for the other methods.
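The under-warning and over-warning percentages used in this comparison can be computed as in the minimal Python sketch below. It assumes that UP and UO in the table headers denote the predicted and observed intensity, that under-warning corresponds to UP - UO < -5 m s⁻¹ and over-warning to UP - UO > 5 m s⁻¹; the sample values are made up for illustration.

```python
def warning_percentages(predicted, observed, threshold=5.0):
    """Percentage of cases that are under-warned (predicted - observed < -threshold),
    over-warned (predicted - observed > threshold), or within the threshold."""
    differences = [p - o for p, o in zip(predicted, observed)]
    n = len(differences)
    under = 100.0 * sum(d < -threshold for d in differences) / n
    over = 100.0 * sum(d > threshold for d in differences) / n
    return under, over, 100.0 - under - over


# Made-up peak intensities (m/s) for eight hypothetical cases.
predicted = [30.0, 42.0, 25.0, 51.0, 38.0, 45.0, 33.0, 60.0]
observed = [38.0, 40.0, 26.0, 44.0, 37.0, 52.0, 33.0, 58.0]

under_pct, over_pct, within_pct = warning_percentages(predicted, observed)
print(f"under-warning: {under_pct:.1f}%  over-warning: {over_pct:.1f}%  within: {within_pct:.1f}%")
```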
Figure 2 The 12 locations over India considered in this study for objective debiasing. The latitude (°), longitude (°), altitude (m) and pressure level (mb) of each station are given along with the station name. The abbreviation used for each station name is given in brackets.
5. Station-Scale Forecasts from Limited Area Model (LAM)
Station-scale forecasts are necessary for many applications related to health (such as vector-borne disease), agriculture (such as germination potential) and industry (such as power requirements), where the diurnal cycle of temperature plays a critical role. Such station-scale forecasts from dynamical models necessarily have to be obtained through a downscaling procedure. Similarly, while typical climate simulations generate fields averaged over thousands of square kilometers, many applications require meteorological fields at local scale. While meso-scale models today can support horizontal grid spacing down to a few kilometers or less, downscaling of model forecasts to arrive at station-scale values will remain a necessary step for many applications. While generic improvement in model skill requires parallel and comprehensive development of the model and other forecast methodology, one way of achieving skill in station-scale forecasts without (effort-intensive) calibration of the model is to implement an objective bias correction (referred to as debiasing). We show that a non-linear objective debiasing can transform zero-skill forecasts from a meso-scale model (MM5) into forecasts with significant skill. We consider 12 locations over India (Figure 2), representing urban sites in different geographical conditions, during May-August 2009. MM5 was integrated for 24 hours with initial conditions from the global gridded Final (FNL) analysis of the National Centers for Environmental Prediction Global Forecast System for each day of May-August 2009 in a completely operational setting (without assuming any observed information on dynamics beyond the time of the initial condition).
Figure 3 Percentage of days (out of 123 days for RF and DFP, 83 days for DFR) for which the daily averaged temperature is in the error bin -1 to 1 °C from May to August 2009 for each station. The average over all stations is given with the legends. The observed station temperatures are adopted from IMD observations, while the station forecasts are downscaled from model forecasts at 10 km resolution.
A summary of skill averaged over the 12 stations for each of the four months shows the raw forecasts to have essentially zero or negative skill scores in all cases. The realizable skill (DFR), while generally lower than the potential skill (DFP) as expected, is significant for all four months. The average errors in daily average, minimum and maximum temperature are generally less than 1 °C for both DFP and DFR. In terms of the percentage of days for which the daily average error is between -1 °C and +1 °C, the debiased forecasts appear to be far superior to the raw forecasts for all the months; while nearly 76% (80%) of days for DFR (DFP) lie within -1 °C to +1 °C when averaged over all four months, the corresponding percentage for the raw forecasts is only 39% (Figure 3).
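The verification measures used in this section (bias, mean absolute error, root mean square error and the percentage of days with error within -1 °C to +1 °C) can be sketched in Python as below. The function name, the inclusive treatment of the ±1 °C bounds and the sample temperatures are assumptions made for illustration; they are not taken from the thesis.

```python
import math


def verification_summary(forecasts, observations, tolerance=1.0):
    """Bias, MAE, RMSE and percentage of cases with |error| <= tolerance."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    return {
        "bias": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "pct_within": 100.0 * sum(abs(e) <= tolerance for e in errors) / n,
    }


# Made-up daily average 2-m temperatures (deg C) for five days at one station.
forecast_temps = [30.2, 31.5, 29.0, 33.1, 28.4]
observed_temps = [30.0, 30.0, 29.5, 32.9, 30.0]

summary = verification_summary(forecast_temps, observed_temps)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Applying the same function to raw and debiased forecasts for the same days gives directly comparable numbers for each station.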
Table 3. A comparison of the performance of the raw forecast (MM5-RF) and the debiased forecast (MM5-DFR) in terms of Mean Absolute Error (MAE), Bias Error (BE) and Root Mean Square Error (RMSE) in maximum (Tmax) and minimum (Tmin) temperature by different methods. For MM5-RF and MM5-DFR all parameters are calculated over 123 days and 83 days, respectively, from May to August 2009.
[Table body not recovered. Surviving column headers: MAE (%) < 10; RMSE in Tmax (°C); RMSE in Tmin.]
* June - August 2004 (Cheng and Steenburgh 2007).
+ May - August 2009; average over 12 stations (present method).
† June - September, 1997-2000; average over 12 stations (Maini et al. 2003).
A comparison of the skill of the present method with a number of other debiasing methods (Maini et al. 2003; Cheng and Steenburgh 2007) shows the present method to have generally better skill (Table 3). The other methods include the Eta Model (ETA), which was renamed the North American Meso (NAM) model in 2005, model output statistics with the ETA model (ETAMOS), the Kalman filter with the ETA model (ETAKF) and a 7-day running mean bias removal with the ETA model (ETA7DBR).
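For reference, a 7-day running mean bias removal of the kind named above can be sketched as follows. This is a generic illustration, not the ETA7DBR implementation: the treatment of the first days (using whatever history is available, and leaving the first forecast unchanged) and the sample values are assumptions.

```python
def running_mean_bias_removal(forecasts, observations, window=7):
    """Subtract from each forecast the mean error of the preceding `window` days.
    Early days use the shorter history available; day 0 is left unchanged."""
    corrected = []
    for day, forecast in enumerate(forecasts):
        start = max(0, day - window)
        past_errors = [f - o for f, o in zip(forecasts[start:day], observations[start:day])]
        bias = sum(past_errors) / len(past_errors) if past_errors else 0.0
        corrected.append(forecast - bias)
    return corrected


# Made-up temperatures (deg C): the forecast runs a constant 2 deg C too warm.
observed_series = [30.0, 31.0, 29.5, 32.0, 33.0, 31.5, 30.5, 29.0, 28.5, 30.0]
forecast_series = [o + 2.0 for o in observed_series]

corrected_series = running_mean_bias_removal(forecast_series, observed_series)
print(corrected_series)
```

A running mean of this kind removes a slowly varying offset but, unlike equation (1), it cannot represent a bias that depends on the forecast value itself.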
6. Conclusion and Future direction.
The methodology of debiasing explored here improves forecast skill over the raw forecasts at the locations considered; however, the skill achieved for specific locations does not necessarily reflect skill over the domain as a whole. It is conceivable to generate fields of debiasing parameters on a grid if sufficient observations are available; this possibility, which requires considerable effort, will be explored in a separate work. Further, especially for operational applications, more optimal model configurations may yield higher skill.
One important question is that of interannual variability in the station variables, and hence the stability of the debiasing parameters over a period of time (years). This issue is important for actual implementation of the method. In particular, it will be necessary to examine the effectiveness of the objective debiasing for a number of years based on calibration of the debiasing parameters for any given year. The true forecast potential of the methodology can only be judged when it is applied to other years with the same debiasing parameters for the month and the stations.