Forecasting The Traffic Accidents In The Philippines Finance Essay

Published: November 26, 2015 Words: 4243

Most countries need to know how their road accident casualty figures have developed over time (Broughton, 1991). These totals may be gathered and examined using monthly or annual data in order to see how their traffic systems have progressed as time goes by, which leads to the question of how these totals will change in the future. This paper assesses monthly traffic accident data in the Philippines from 2005-2009 and forecasts the total traffic casualties for each month of the year 2010. It attempts to generalize the totality of accidents in the Philippines using data summarizing traffic accidents on one of the most heavily used roads in the country, the North Luzon Expressway (an 84-km road composed of 8 lanes). The highway connects the provinces of North Luzon to Metro Manila, the capital of the country.

This study focuses on forecasting the number of traffic accidents involving one or more drivers. Persson (2009) defines a traffic accident as an accident that occurred on a street open to public traffic, in which one or more persons were killed or injured and at least one moving vehicle was involved. Moreover, he notes that this includes collisions between vehicles; between vehicles and pedestrians; between vehicles and animals; and between vehicles and fixed obstacles.

Forecasting is defined as the prediction of future magnitudes of economic variables, or the anticipation of the likely behavior of an economic agent in light of policy interventions. The method used here is the univariate Box-Jenkins methodology. The resulting forecasts can aid the country's leaders in planning the traffic management system, which would most likely improve the economy's social welfare.

Statement of the Problem

This paper forecasts the traffic accidents along the North Luzon Expressway (NLEX) in the Philippines for every month of 2010.

Objectives: To predict the traffic accidents along the North Luzon Expressway (NLEX) for all the months of 2010.

Significance of Study

Forecasting road accidents is essential because if the government or the private sector wants to develop or improve the current roads and highways, it has to anticipate the consequences of its planning and management. Another reason is the need for equitable allocation of the budget, especially for a developing country like the Philippines where the budget is highly constrained. A better grasp of what will happen in the future will also help in policy making in terms of road traffic safety, congestion and road quality. Additional traffic enforcement and legislation of traffic laws may also be warranted if an increase in traffic accidents is forecast.

Review of Related Literature

Previous studies on Traffic Accidents

In their 2005 paper on traffic accidents, Hermans and his colleagues gathered figures concerning total traffic accidents in the world. It was estimated that 1.2 million people were killed in road crashes each year and as many as 50 million were injured. Every government in the world tries to develop policies to lessen the growing incidence of injuries and deaths caused by traffic accidents. Inward and outward preventive measures of traffic accidents were used to inform these policies, including forecasting the problem in order to generate plans for the future. Inward and outward measures are the determinants of traffic accidents used with the aim of reliably measuring the latter.

In a more specific analysis, it was estimated that 270 thousand road traffic accidents occur in Italy each year, resulting in 330 thousand injuries and 7 thousand deaths annually (La Torre, 2009). The main causes of the total accident rate, based on the results of the study, are drivers' consumption of alcoholic beverages, which makes most of them more aggressive in their driving, and the number of vehicles in traffic. In addition, the study of Iversen and Rundmo (2004) suggested that the main causes of traffic accidents in Norway are reckless driving, rule violation and speeding. Moreover, they also indicated that younger drivers, male drivers and drivers who have held licenses for less than ten years tend to be more prone to traffic accidents because of their negative attitudes towards safety. The country's government could therefore direct its policies toward stricter traffic safety rules for young drivers, since most accidents are caused by these drivers. Furthermore, economists in Belgium (2005) studied the frequency and severity of road traffic accidents and their long-term trends. The time-series approach they used proved very useful as a forecasting tool for determining policies and budgets for road traffic safety. These identified causes could be used by decision makers as preventive measures to lessen traffic accidents in the future.

There are also countries, like the United Arab Emirates, which have not built a forecasting model to determine the causes of traffic accidents and to aid in setting policies for future plans for the traffic system. Bener and Crundall (2005) found that the discovery of oil during the last century made a major impact on the life of the country. They also noted a huge number of immigrants in the country, which resulted in an increase in the number of vehicles accompanied by the expansion of road construction programs. Along with these events, however, came a rapid increase in the number of road traffic accidents with casualties and fatalities, which created a serious economic problem. The problem could have been mitigated if the government had conducted studies so that laws and policies could be implemented in response.

There is a wide range of earlier research on the factors behind road accidents. Peltzman, in his 1975 paper on the effects of automobile safety regulation, proposed several causes of traffic accidents. The first is the level of income, which according to him has an uncertain effect on traffic mortality rates because income can increase the demand for safety as well as for driving intensity, which raises the probability of accidents. This relationship shows that income is not a good indicator of the level of traffic accidents. Other variables that Peltzman included in his model are the volume of drivers and traffic density, which have positive relationships with road accidents because of the higher exposure to hazardous conditions. In contrast, some research on the relationship between road accidents and traffic density has found a negative relationship: Noland's 2002 study showed that higher traffic density and congestion result in lower levels of road accidents, mainly because both situations reduce vehicle speeds and thus the severity of vehicle interactions. The third determinant in Peltzman's model is alcohol intoxication, which has a direct positive effect on the number of road incidents. Peltzman took into consideration the fact that younger drivers have a higher probability of accidents in relation to alcohol intoxication, because teenagers are relatively irresponsible in these situations. Lastly, Peltzman claimed that additional installations of safety precautions have indirect effects on traffic accidents. Building on Peltzman's model, Zlatoper's 1987 paper extended it to include the explanatory power of climate and traffic supervision. To run through these two added variables: for climate, the presence of precipitation increases the chances of collision and injury. Precipitation during night and day was also tested in relation to traffic accidents, but no clear statistically significant effect was found (Andrey, Mills & Vandermolen, 2002).

As for the variable added by Zlatoper, additional traffic supervision results in better management of the traffic system because of increased enforcement on the road (European Transport Safety Council, 1999). People who observe frequent road apprehensions will be more cautious in their driving, thus reducing the level of traffic accidents as well (Makowsky & Stratmann, 2009).

Of course, the human behavior discussed above is not the only determinant of traffic accidents. Existing road infrastructure may also contribute to traffic-related accidents, and improvement of roads and bridges is a widely used method of decreasing the level of road accidents. Road infrastructure is characterized by total lane miles, the average number of lanes for alternative road classes, the lane widths for alternative road classes, and the percentage of each road class within a given state (Noland, 2002).

Forecasting Traffic Accidents

Decision making and planning should take place before any casualty and loss due to traffic accidents occurs; forecasting methods can aid in this. Zheng and Liu (2008) state that one of the most popular forecasting approaches is the time-series method. It uses historical time-series data as the basis for estimating future outcomes. The time-series data should be a sequence of observations at different time points; the observed information may be a purely random process or a noisy but orderly one. Time-series variation can be divided into four categories: a long-term trend factor, a seasonal factor, a periodical factor and an irregular factor. Zheng and Liu (2008) also note that the time-series method comprises a group of techniques, such as the seasonal adjustment method, the moving average method, the autoregression method and the function fitting method.

Time-series methods differ from regression models because they are based on the correlation among the time-series observations. For accident forecasting, the commonly used methods are exponential smoothing and the ARIMA method. The method discussed in this paper is the ARIMA (or ARMA) method using a univariate model, also described as the Box-Jenkins model (Box, Jenkins & Reinsel, 1997).

An ARMA model is in most cases a combination of an autoregressive (AR) part and a moving average (MA) part, both of which the model attempts to fit. Two issues are involved: the analysis of the stochastic, stationary and seasonal properties of the time series, and model selection. The model is based on the condition that the time series used for accident forecasting is generated by a zero-mean stationary random process. Furthermore, a differencing step should be applied to a non-stationary time series so that the non-stationarity is removed; this generalized method is called the autoregressive integrated moving average (ARIMA) (Zheng & Liu, 2008). ARMA has been used in some accident-forecasting research to correct the error terms. In Gandhi and Hu's (1995) research, a differential equation model was used to represent the accident mechanism with time-varying parameters, and an ARMA process of white noise was attached to model the equation error. Moreover, in Van den Bossche, Wets, and Brijs (2004), a regression was applied to time-series data, but the error terms were often autocorrelated, which means the model does not make optimal use of the data for forecasting: when autocorrelation is present, the error terms probably contain information that is not captured by the explanatory variables. ARIMA is used to model this information so that the effect of the explanatory variables on the dependent variable can be estimated more reliably.
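To make the AR/MA combination concrete, the sketch below simulates an ARMA(1,1) process of the kind this model family describes. The coefficients are invented for illustration, not estimated from the NLEX data:

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample

# ARMA(1,1): y_t = 0.6*y_{t-1} + e_t + 0.4*e_{t-1}
# statsmodels works with lag polynomials, so the AR coefficient enters
# with a flipped sign: ar = [1, -phi], ma = [1, theta].
ar = np.array([1.0, -0.6])
ma = np.array([1.0, 0.4])

np.random.seed(1)
y = arma_generate_sample(ar, ma, nsample=500)

print(y.shape)
print(round(y.mean(), 2))  # close to zero, as the zero-mean premise requires
```

A series like this satisfies the zero-mean stationarity condition directly; real accident counts generally need differencing (the "I" in ARIMA) first.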

Unexpected events or errors often influence the values of accident time-series data. These outliers occur frequently in practice and can have serious consequences, so it is important to account for their impact in the model through statistical methods designed to deal with them (Mcleod & Vingilis, 2008). An extension of the ARIMA model is intervention analysis, which allows the study of changes in the magnitude and structure of time-series data. It may be advisable to base the outlier analysis on more flexible models such as functional autoregressive (FAR) models, because the non-linear features of a series may behave in a more complicated way than standard models allow. One study suggested that the FAR-based method is effective both for series following some non-linear models and for linear series generated by ARMA processes (Battaglia, 2005).

Framework

For this study, the group will make use of the Box-Jenkins methodology which involves identifying the appropriate ARIMA process to utilize in order to get the stationary and stable data needed for valid forecasting.

Box-Jenkins Methodology

The Box-Jenkins methodology, which is simply another name for the autoregressive integrated moving average (ARIMA) modeling methodology, is an iterative procedure that determines the most desirable member of the ARIMA(p,d,q) family. It is the most popular approach to forecasting models owing to its flexibility in accommodating both ARMA and ARIMA procedures, which are sufficient to supply the right specification for forecasting.

Identification

The initial step in forecasting is to identify the model to be used, where the preliminary values of p, d and q are determined. It involves transforming the data by differencing, using the unit root test, until the ADF statistic exceeds the critical values; this determines the value of d. After establishing stationarity, the next step is to examine the correlograms: the plots of the autocorrelation function (ACF) and the partial autocorrelation function (PACF) against the lag length reveal the orders of the autoregressive and moving average components. The PACF indicates the value of p, while the ACF indicates the value of q.

Estimation

The second step is to estimate the parameters of the AR and MA terms in the model. This is done by inputting the values of p, d and q identified in the previous stage. The values obtained for the partial autocorrelation and autocorrelation from the stationary correlogram are used in formulating the initial ARIMA equation.

Diagnostic Checking

The third step is to analyze whether the chosen ARIMA specification best suits the data. Alternative ARIMA equations are formulated so that they can be compared to the initial ARIMA equation and to each other. The ARIMA equation whose autoregressive and moving average terms have the most significant values and the lowest Akaike Information Criterion and Schwarz Criterion is the basis of the equation chosen to forecast the traffic accidents on the North Luzon Expressway.

Forecasting

The final stage is forecasting the values themselves using computer software, Eviews in our case. Since forecasting up to April 2010 is not possible with only the raw data at hand, we also forecast the first three months of the year: January, February and March.

Data and Analysis

Stage 1: Identification Process

Before anything else, stationarity of the series must first be established. For forecasting, we need data with a constant mean and variance over time, signifying stationarity.

Graphical Analysis

We can see from the graph that the data show no signs of stationarity, as evidenced by the many fluctuations.

White Noise Testing

A white noise error term is said to exist when the ACF at different lags is close to zero, i.e. hugs the vertical line. Hence, to see whether a white noise error term exists, a correlogram of the variable's residuals should be examined.

Correlogram for Residuals at LEVEL

Correlogram for Residuals at 1st difference

Correlogram for Residuals at 2nd difference

The correlogram of the residuals for accidents was tested at level, first difference and second difference; based on the tests, we can see that they depict non-stationarity, with autocorrelation coefficients far from zero.

Unit-Root Testing

For forecasting, de-trending is needed for some data before proceeding to the analysis. Two unit-root tests will be performed: the Augmented Dickey-Fuller (ADF) test and the Phillips-Perron (PP) test. "In particular, where the ADF tests use a parametric autoregression to approximate the ARMA structure of the errors in the test regression, the PP tests ignore any serial correlation in the test regression." "The ADF and PP unit root tests are for the null hypothesis that a time series yt is I(1)."

Augmented Dickey-Fuller Unit Root Test

Three versions of the unit root test will be used in testing the data for stationarity with the ADF: the first with an intercept, the second with a trend and intercept, and the last with neither (plain ADF). We will observe at which level of differencing each test yields stationarity.

Augmented Dickey-Fuller at level (INTERCEPT)

As seen in the figure above, the ADF statistic's absolute value of 1.774918 did not exceed the MacKinnon critical values of 3.54, 2.91 and 2.59 in absolute value; hence, the data are not yet stationary at level (intercept) and must undergo further differencing tests.

Augmented Dickey-Fuller at 1st difference (INTERCEPT)

In the figure above, the ADF statistic's absolute value of 5.47 exceeded the MacKinnon critical values of 3.54, 2.91 and 2.59 at the 1%, 5% and 10% levels. We can therefore say that the data are already stationary at first difference (intercept), so we do not have to test the second difference.

Augmented Dickey-Fuller at level (TREND & INTERCEPT)

Trend and intercept is another version of the unit root test. From the results shown above, we can see that the ADF statistic's absolute value of 3.36 did not exceed the MacKinnon critical value at the 1% level. This signifies that the data are still not stationary and must undergo first-order differencing.

Augmented Dickey-Fuller at 1st difference (TREND & INTERCEPT)

We can see in the first-order differencing test with trend and intercept that the ADF statistic's absolute value already exceeds the MacKinnon critical values at the 1%, 5% and 10% levels. This result is analogous to the intercept test result: both say the data are already stationary at first difference, so the second-order test need not be executed.

Augmented Dickey-Fuller at level (NONE)

Using the third version of the unit root test (none) at level, we can see that the ADF statistic of 0.39 did not exceed the MacKinnon critical values at the 1%, 5% and 10% levels, which are -2.60, -1.94 and -1.61 respectively. This implies that the data are still not stationary at level (none), so the next differencing test is required.

Augmented Dickey-Fuller at 1st difference (NONE)

Since the ADF statistic's absolute value of 5.43 exceeded the MacKinnon critical values at all levels, we can infer that the data are already stationary at first difference (none), as in the intercept and the trend-and-intercept unit root tests.

Phillips-Perron Unit Root Testing

Another test for stationarity is the Phillips-Perron unit root test. It is similar to the Augmented Dickey-Fuller test, measuring stationarity for intercept, trend and intercept, and none, at level and at first and second difference.

1. Phillips-Perron at level (INTERCEPT)

As seen in the figure above, the PP test statistic of -3.20 does not exceed the MacKinnon critical value at the 1% level, which is -3.54. This tells us that at level (intercept) of the Phillips-Perron test, the data exhibit non-stationarity; therefore, first-order differencing must be done.

2. Phillips-Perron at 1st difference (INTERCEPT)

We can see from the results that the PP test statistic already exceeds the MacKinnon critical values at all levels, implying stationarity of the data. In this case we no longer have to test the second difference.

3. Phillips-Perron at level (TREND AND INTERCEPT)

In the Phillips-Perron trend-and-intercept test at level, we can see from the illustration above that the PP test statistic has an absolute value of 5.11, exceeding all the critical values of 4.11, 3.48 and 3.17. In this case, we no longer have to test the first difference since the data are already stationary at level.

4. Phillips-Perron at level (NONE)

As seen in the figure above, the PP test statistic for the data does not exceed the MacKinnon critical values at any level. This implies that at level (none) the Phillips-Perron test finds the data non-stationary, so they need to undergo first-order differencing.

5. Phillips-Perron at 1st difference (NONE)

As illustrated by the figure above, the PP test statistic for the data is greater than the MacKinnon critical values at all levels: the statistic's absolute value of 11.25 exceeds the critical values of 2.60, 1.94 and 1.61 at the 1%, 5% and 10% levels of significance. This implies that at first difference (none) the data are stationary.

Correlogram

At LEVEL

At 1st DIFFERENCE

At 2nd DIFFERENCE

Estimation Process

Initial ARIMA

D(accidents,1) ar(1) ar(2) ar(3) ar(4) ar(5) ar(6) ma(1) ma(2)

Alternative ARIMA Equations

D(accidents,1) ar(1) ar(2) ar(3) ar(4) ar(5) ma(1) ma(2)

D(accidents,1) ar(1) ar(2) ar(4) ar(5) ar(6) ma(1) ma(2)

D(accidents,1) ar(1) ar(2) ar(3) ar(4) ar(5) ar(6) ma(1)

D(accidents,1) ar(1) ar(2) ar(3) ar(4) ar(5) ma(1)

Diagnostic Checking

Table 1: A summary of the results of the initial and alternative ARIMA models tested above

Model 1 (initial): ar(1) ar(2) ar(3) ar(4) ar(5) ar(6) ma(1) ma(2)
Model 2: ar(1) ar(2) ar(3) ar(4) ar(5) ma(1) ma(2)
Model 3: ar(1) ar(2) ar(4) ar(5) ar(6) ma(1) ma(2)
Model 4: ar(1) ar(2) ar(3) ar(4) ar(5) ar(6) ma(1)
Model 5: ar(1) ar(2) ar(3) ar(4) ar(5) ma(1)

Criterion     Model 1     Model 2     Model 3     Model 4     Model 5
RMSE          25.43908    28.31251    27.96979    28.87555    28.64975
MAE           19.13251    22.55723    21.78302    21.32527    22.43590
MAPE           9.308163   11.20157    10.69067    10.34280    11.24850
TIC            0.060076    0.067540    0.066468    0.068463    0.067710
BP             0.031601    0.042241    0.056590    0.041514    0.001259
VP             0.026170    0.048119    0.064441    0.040495    0.159766
CP             0.942229    0.909640    0.878969    0.917991    0.838975
AIC            9.606458    9.779030    9.759386    9.823127    9.766348
SIC            9.901122   10.03451    10.01722    10.08096     9.985330
DW             2.004217    1.993362    2.008233    1.983163    2.054777
R-squared      0.463495    0.324799    0.351254    0.308557    0.308618

The criteria used in selecting the best model are the root mean squared error (RMSE), mean absolute error (MAE), mean absolute percent error (MAPE), Theil inequality coefficient (TIC), bias proportion (BP), variance proportion (VP), covariance proportion (CP), Akaike Information Criterion (AIC), Schwarz Information Criterion (SIC), Durbin-Watson statistic (DW) and R-squared. The best model must have the lowest values on all criteria except the R-squared, Durbin-Watson and covariance proportion. As we can see from the table, our initial ARIMA equation gave the "best" values on almost all criteria, with the lowest Akaike Information Criterion and Schwarz Criterion values of 9.606458 and 9.901122 respectively. Moreover, the initial ARIMA equation also yielded highly significant coefficients, with p-values of 0.0000.
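For the error-based criteria, a hedged sketch of how RMSE, MAE, MAPE and the Theil inequality coefficient are computed from actual and forecast values (the numbers are illustrative, not the paper's series):

```python
import numpy as np

# Illustrative actual and forecast values (not the paper's data).
actual = np.array([210.0, 225.0, 198.0, 240.0, 232.0, 215.0])
forecast = np.array([205.0, 230.0, 210.0, 228.0, 236.0, 220.0])

err = forecast - actual
rmse = np.sqrt(np.mean(err ** 2))            # root mean squared error
mae = np.mean(np.abs(err))                   # mean absolute error
mape = np.mean(np.abs(err / actual)) * 100   # mean absolute percent error

# Theil inequality coefficient: 0 = perfect forecast, 1 = worst case.
tic = rmse / (np.sqrt(np.mean(forecast ** 2)) + np.sqrt(np.mean(actual ** 2)))

print(round(rmse, 3), round(mae, 3), round(mape, 3), round(tic, 4))
```

The bias, variance and covariance proportions in the table are the standard decomposition of the squared forecast error that accompanies the Theil coefficient in Eviews output.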

After choosing the "best" forecasting model for the data, we now test for white noise using the Q-statistic from the residuals' correlogram.

Since the correlogram, obtained from Gretl, showed that no spike went beyond the confidence band and almost all terms gave significant values, we are led to believe that the model does indeed exhibit white noise, i.e. stationary residuals.

Forecasting

Now that the final model has been chosen and its residuals tested as white noise (stationary), we are prepared to forecast using our initial (best) ARIMA model: ar(1) ar(2) ar(3) ar(4) ar(5) ar(6) ma(1) ma(2).

Forecasted Values

Using the Gretl program, the forecasting of all the months of 2010 was done with ease using the chosen/best ARIMA model. The point-and-click procedure was much simpler than the ex-ante forecasting in Eviews, which was complicated and time-consuming. Moreover, in the middle of the table above, the standard errors are given for each month. Lastly, to the right of the forecasted values, the optimistic and pessimistic forecast values are shown, giving 95% confidence that the traffic accidents in each month would fall between those values.

Conclusion


Sources

Broughton, J. (1991). Forecasting Road Accident Casualties in Great Britain. Accident Analysis & Prevention, 23(5), 353-362.

Hermans, E., Wets, G. & Van den Bossche, F. (2005). The Frequency and Severity of Road Traffic Accidents Investigated on the Basis of State Space Method. Retrieved October 26, 2009 from doclib.uhasselt.be/dspace/bitstream/1942/1500/1/frequency.pdf

La Torre, G., Van Beeck, E., Quaranta, G., Mannocci, A. & Walter, R. (2007). Determinants of Within-Country Variation in Traffic Accident Mortality in Italy: A Geographical Analysis. International Journal of Health Geographics. Retrieved October 26, 2009 from http://www.ij- healthgeographics.com/content/6/1/49

Iversen, H. & Rundmo, T. (2004). Attitudes Toward Traffic Safety, Driving Behaviour and Accident Involvement Among The Norwegian Public. Ergonomics, 47, 5, 555-572. Retrieved October 29, 2009 from http://www.tandf.co.uk/journals

Hermans, E., Brijs, T., Stiers, T. & Offermans, C. (2004). The Impact of Road Conditions on Road Safety Investigated in an Hourly Basis. Retrieved October 27, 2009 from uhdspace.uhasselt.be/dspace/bitstream/1942/9525/1/ntensity.pdf

Bener, A. & Crundall, D. (2005). Road Traffic Accidents in United Arab Emirates Compared to Western Countries. Advances in Transportation Studies an International Journal, A6. Retrieved October 28, 2009 from www.salimandsalimah.org/documents/RTAsinUAEcompared.pdf

Peltzman, S. (1975). The Effects of Automobile Safety Regulation. The Journal of Political Economy, 83, 4, 677-726. Retrieved October 29, 2009 from http://www.jstor.org/stable/1830396

Peltzman, S. (2004). Regulation and the Natural Progress of Opulence. AEI-Brookings Joint Center for Regulatory Studies. Retrieved October 31, 2009 from http://aei- brookings.org/admin/authorpdfs/redirect-safely.php?fname=../pdffiles/phpUS.pdf

Zlatoper, T. (1987). Factors Affecting Motor Vehicle Deaths in the USA: Some Cross-sectional Evidence. Applied Economics, 19, 753-761. Retrieved October 29, 2009 from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1067816