Abstract
Introduction
Traffic flow forecasting has become a crucial subject in intelligent transportation systems because of its clear impact on daily life. Short-term traffic flow prediction, used in planning and developing traffic management and control systems, provides important data support for enhancing the real-time dynamic behavior of traffic control systems. Short-term predictions give traffic managers the basic information needed to improve and harmonize traffic states and the traffic environment. At the same time, unreliable traffic flow data may cause serious consequences. Therefore, reliable traffic flow prediction methods are required on time scales of minutes, together with a way of accounting for uncertainty when optimizing control processes.
Various traffic flow forecasting methods have been proposed in the past. In general, short-term traffic flow prediction methods can be divided into two groups according to the number of data sources: methods that fuse multi-source data,1–6 which try to improve forecasting precision by drawing on more information, and methods that use traffic data from a single source.7–10 Recent studies attempt to improve the accuracy of prediction methods using multi-source data. The number of detectors to be used has also been considered in order to obtain more reasonable traffic networks for traffic flow prediction. 11 However, for short-term traffic flow forecasting based on multi-source data, several problems remain to be addressed, such as how multi-source data can be associated and processed (since multi-source data may belong to different temporal scales and dimensions, they are complex and difficult to harmonize). Furthermore, traffic detectors with overlapping and repetitive content do not have universal significance in traffic networks, except on special sections.
There are many methodologies in the literature dedicated to short-term traffic flow forecasting using a single source of traffic data from detectors of the same type; these have wider applicability than forecasting methods based on multi-source data. Short-term traffic flow prediction models include nonlinear and linear models. Artificial neural networks,12–15 k-nearest neighbors,16–18 and online support vector regression (SVR) 19 are nonlinear models; they are supervised machine learning methods that learn relationships between input and output, and their main disadvantage is that model training and parameter calibration require large amounts of data and time. Linear traffic flow prediction models include Kalman filtering20,21 and the autoregressive integrated moving average (ARIMA) model.22,23 Abadi et al. 24 implemented both a traffic flow completion methodology and simple traffic flow prediction based on time series methods, taking into account the uncertain nature of traffic and the historical traffic data of a road network. ARIMA models, which are based on past values of the modeled time series, have been applied in different real scenarios; yet their calibration is quite time consuming. Kalman filtering is an elegant model that can be implemented online and does not require heavy computation. 25 In addition, prediction models using Kalman filtering turn out to be quite reliable for short-term forecasting and show good experimental results. Guo et al. 26 and Lu et al. 27 used Kalman filtering to improve the initial predictions of a limited area time series model. Ojeda et al. 28 proposed two forecasting methods based on Kalman filtering for multi-step traffic flow prediction. The recent literature based on one-dimensional traffic flow time series attempts to present novel prediction methods that enhance accuracy without considering data integrity. Generally, Kalman filtering is better suited to linear, slowly changing conditions.
If a traffic accident occurs, the traffic flow will increase or decrease significantly within a short period. In that case, the proposed method cannot predict the traffic flow as accurately as it does under normal conditions. However, it still performs better than the original Kalman filtering, because its self-adaptive weight coefficients can handle fluctuations of traffic flow to some extent.
Considering the time-varying nature and complexity of traffic flow parameters and the disadvantages of prediction methods based on one-dimensional traffic flow time series, in this article we combine models to establish a fusion prediction method based on Kalman filtering and the idea of adaptive weights allocation. The fusion prediction method takes two-dimensional traffic flow time series data from a single detector as input, comprising horizontal and vertical time series data. Kalman filtering is used to predict traffic flow values from the current day data and from the historical data separately; the two results are then assigned weight coefficients, which are generated in real time during prediction. In particular, the main contributions of this article can be summarized as follows:
Two-dimensional traffic flow time series data from a single detector is considered as dynamic input data, which can enhance the efficiency of data utilization.
The idea of adaptive weights allocation is presented to improve reasonability of traffic flow forecasting results. Three adaptive weights allocation methods for short-term traffic flow prediction algorithms are proposed.
A novel two-dimensional predictive method is proposed based on the Kalman filtering theory and two-dimensional traffic flow time series data from a single detector.
The two-dimensional framework of the predictive method includes two spatial dimensions, the traffic flow and weights predictive dimensions.
This article is organized as follows. A brief general introduction to Kalman filtering used to perform traffic flow prediction is provided in section “Prediction equations of discrete-time Kalman filtering.” In section “Two-dimensional prediction method,” we propose a short-term traffic flow two-dimensional prediction method from a unique perspective to time series forecasting, using Kalman filtering and adaptive weight coefficients calculated in the process of prediction dynamically. In section “Performance of proposed method,” we demonstrate our method using extensive experimental comparison of methods and analyze results. Conclusion and future work are presented in section “Conclusion.”
Prediction equations of discrete-time Kalman filtering
Kalman filtering, also known as an optimal recursive data processing algorithm, is an efficient algorithm for state inference in a real-time state-space model of a linear dynamic system. In 1960, Kalman 29 published a famous paper, “A New Approach to Linear Filtering and Prediction Problems,” presenting a recursive solution to the discrete-data linear filtering problem. Since then, owing to advances in digital computing, Kalman filtering has become widely used and has been shown to be the best adaptive estimator in engineering. 30
In this article, Kalman filtering is applied to multidimensional traffic flow prediction systems because it adapts easily to any alteration of variables. Specifically, each new state update is obtained from calculations involving the former state and new inputs. The linear dynamical procedure of the classical Kalman filtering algorithm can be described as follows.
First, we introduce a discrete system that can be represented by a linear stochastic difference equation and a measurement equation
and
where
According to the Kalman filter algorithm, the expected value is calculated using propagation and measurement update equations. Based on the system state, the propagation equations for the present system state and the filtering error covariance are as follows
where
According to equation (3), the estimated state
Combining the estimated state
where
where
In order to keep the dynamic performance of Kalman filtering, the covariance
Equations (1), (2), and (4)–(6) update the Kalman filtering algorithm from period
Kalman filtering, known as a pure statistical technique, tries to find relationships between some explanatory variables and measured traffic flow data. It is worth noting that Kalman filtering is an optimal sequential estimation procedure statistically. We design a novel prediction method based on Kalman filtering. Details on the application of Kalman filtering to dynamic systems can be found in Antoniou et al., 30 Welch and Bishop, 31 and Zuluaga et al. 32
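The propagation and measurement update cycle described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes a scalar random-walk state model (state transition and measurement coefficients both equal to 1), and the function name `kalman_one_step` and the noise variances `q` and `r` are hypothetical choices made only for the example.

```python
def kalman_one_step(measurements, q=1.0, r=4.0, x0=0.0, p0=100.0):
    """Return one-step-ahead predictions for a scalar series.

    Assumed model (not taken from the article):
        state:        x_k = x_{k-1} + w_k,  Var(w_k) = q
        measurement:  z_k = x_k + v_k,      Var(v_k) = r
    """
    x, p = x0, p0
    predictions = []
    for z in measurements:
        # Propagation: project the state estimate and its covariance ahead.
        x_prior = x
        p_prior = p + q
        predictions.append(x_prior)
        # Measurement update: blend the prediction with the new observation.
        gain = p_prior / (p_prior + r)
        x = x_prior + gain * (z - x_prior)
        p = (1.0 - gain) * p_prior
    # Prediction for the next, still unobserved period.
    predictions.append(x)
    return predictions


if __name__ == "__main__":
    flows = [52.0, 55.0, 53.0, 58.0, 60.0]  # made-up counts per interval
    print(kalman_one_step(flows))
```

Each pass through the loop needs only the previous estimate and the new measurement, which is why the filter is cheap enough to run online.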
Two-dimensional prediction method
Basic datasets
In general, traffic flows on weekdays and weekends are different. Weekday trips are mainly commutes or business trips, whereas weekend trips are mainly for entertainment or relaxation. Therefore, the characteristics and peak hours of weekdays and weekends differ. Based on this analysis, we treat 1 week as a cycle. Traffic flows on the same day of the week are considered similar to each other; this is a rough way to select similar traffic flow data, consistent with the approach of FG Habtemichael and M Cetin. 33
In this article, we establish a self-adaptive two-dimensional forecasting method based on two-dimensional traffic flow time series data. The characteristics of data should be described first. A simple example of time series data is given in Table 1.
Table 1. Traffic data sample of model input.
Table 1 shows a sample of two groups of field test data. To forecast the traffic flow at 7:41 on December 8, we need the traffic flow data before 7:42 on December 1, and the data at 7:41 on the same day of the week (Tuesday) for the past few weeks. The data in the first two columns and the last two columns of Table 1 are defined as time series 1 and time series 2, respectively. The traffic flow is predicted from time series 1 and time series 2 using a two-dimensional predictive method based on Kalman filtering.
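As an illustration of how the two input series relate to each other, the following sketch extracts them from per-day records. The data layout (a dictionary mapping a day label to a list of per-interval counts), the function name `build_two_series`, and all numbers are hypothetical; the article does not specify a storage format.

```python
def build_two_series(flows_by_day, current_day, history_days, k):
    """Return (series_1, series_2) for predicting interval k of current_day.

    series_1: earlier intervals of the current day ("horizontal" data).
    series_2: the same interval k on earlier same-weekday dates ("vertical" data).
    """
    series_1 = list(flows_by_day[current_day][:k])
    series_2 = [flows_by_day[day][k] for day in history_days]
    return series_1, series_2


# Made-up counts (vehicles per interval) for three same-weekday dates.
flows = {
    "week-2": [30, 34, 41, 45],
    "week-1": [28, 35, 40, 47],
    "today": [31, 33, 42],  # interval 3 has not been observed yet
}
series_1, series_2 = build_two_series(flows, "today", ["week-2", "week-1"], k=3)
```

Here `series_1` holds the three intervals already observed today and `series_2` holds interval 3 from the two earlier weeks.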
Frame design of two-dimensional predictive method
To address the issue of instability of the current prediction system using a single source of traffic data, we design a novel adaptive prediction method where some parameters are predicted in both dimensions as time goes on.
We use data from the current day and from historical days to predict the traffic flow a few minutes ahead with dynamic weight coefficients. In the first step, we obtain two predicted traffic flow values, one from current day data and one from historical data, using the Kalman filtering algorithm. A fusion method is then required to calculate the final predicted traffic flow value from these two values. We introduce weight coefficients to adjust the contributions of the two values obtained in the first step, which yields more accurate values. To calculate the fused traffic flow value for the next period, the weight coefficients are themselves predicted using the Kalman filtering algorithm. Therefore, the predictive method proposed in this article can roll forward to the next period continually. The self-adaptive weight coefficients change dynamically according to the fluctuations of current day data and historical data, which can handle differences in traffic flow between days to some extent.
For example, to predict the traffic flow of the time
The framework of the prediction method proposed in this article is shown in Figure 1. There is a connection between the first and second dimensions. In particular, in the first dimension, two independent predictions of traffic flows based on the current day data (time series 1) and historical data (time series 2) are made separately,
where

Figure 1. Two-dimensional framework of predictive method.
According to equation (8), real-time weight coefficients can be obtained. In particular, in the second dimension, the estimation of the weight coefficient

Figure 2. Process of two-dimensional prediction method.
There are four major steps in the process of prediction of traffic flow
Step I: prepare original traffic flow data including time series 1 and time series 2. Polish the data, namely, delete singular values, fix missing values, and so on.
Step II: predict traffic flows
Step III: the real value of parameter
Step IV: the predicted traffic flow at period
As shown in Figure 2, the function of the relationship between parameter
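The rolling character of Steps II to IV can be sketched as a loop in which the weight is corrected only after the field value of the period becomes available. The sketch below is written under explicit assumptions that the extracted text does not confirm: the fusion is taken to be `w * pred_1 + (1 - w) * pred_2`, the weight is tracked with a scalar random-walk Kalman update, and the names `scalar_kalman_update` and `rolling_fusion` as well as the noise settings are hypothetical.

```python
def scalar_kalman_update(x, p, z, q=0.01, r=0.1):
    """One propagate-and-correct step of a random-walk scalar Kalman filter."""
    p_prior = p + q
    gain = p_prior / (p_prior + r)
    return x + gain * (z - x), (1.0 - gain) * p_prior


def rolling_fusion(pred_1, pred_2, actual, w0=0.5):
    """Fuse two per-period predictions with a weight tracked online.

    pred_1, pred_2: first-dimension predictions for each period (Step II).
    actual: field data, used only after the period has been predicted.
    Assumed fusion form: fused = w * pred_1 + (1 - w) * pred_2.
    """
    w, p = w0, 1.0
    fused = []
    for y1, y2, y in zip(pred_1, pred_2, actual):
        # Step IV: fuse with the weight predicted from earlier periods.
        fused.append(w * y1 + (1.0 - w) * y2)
        # Step III: once y is observed, recover the weight that would have
        # reproduced it; skip the update when the two predictions coincide.
        if abs(y1 - y2) > 1e-9:
            w_real = (y - y2) / (y1 - y2)
            # Second dimension: filter the weight to predict the next one.
            w, p = scalar_kalman_update(w, p, w_real)
    return fused
```

With constant inputs such as `pred_1 = 10`, `pred_2 = 20` and a field value of 12, the tracked weight settles near 0.8 and the fused value near 12.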
Formulations of fusion of results
Now we need to find a convenient model for fusion of results, which is able to compute traffic flows we want to obtain. There are two requirements for the formulation of fusion. First, the fusion function should be elegant enough to be obtained in real time. Second, parameter
Fusion method 1
Due to correlation between the predicted values of traffic flows based on time series 1 and time series 2 and field data, we assume that the two weight coefficients
Based on equation (10), the relationship between the two weight coefficients
Using equations (10) and (11), the weight coefficients
In fact, in Step III shown in Figure 2, only one weight coefficient (
Now, we formulate a fusion method to express the relationship mentioned in equation (8). Furthermore, feasibility analysis of this method will be presented. First, we formulate a numerical example to test the method. We assume that the real traffic flow is 0.5 veh/min,
As shown in Figure 3, when the predicted traffic flow

Figure 3. Relationship between parameters
Using the numerical example again, we make some groups of

Figure 4. Threshold of parameter
Generally, the 95th percentile is located in one group instead of an exact value. In order to calculate the exact 95th percentiles on both positive and negative sides, we introduce the formula
where
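To make the abnormal points and the percentile thresholds concrete, the sketch below reuses the numerical setting of a real flow of 0.5 veh/min. It rests on assumptions that are not recoverable from the extracted text: the two weights are taken to sum to one, the percentile is computed on ungrouped values by linear interpolation (the article interpolates within a group), and out-of-range weights are clipped to the thresholds. All function names and sample predictions are hypothetical.

```python
def raw_weight(actual, pred_1, pred_2):
    """Weight on pred_1 when actual = w * pred_1 + (1 - w) * pred_2 (assumed form)."""
    return (actual - pred_2) / (pred_1 - pred_2)


def percentile(values, pct):
    """Percentile by linear interpolation between closest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    pos = (len(ordered) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def clip_to_thresholds(weights, history, pct=95.0):
    """Clip weights to the pct-th percentile of history on each side of zero."""
    positive = [w for w in history if w > 0]
    negative = [-w for w in history if w < 0]
    high = percentile(positive, pct)
    low = -percentile(negative, pct)
    return [min(max(w, low), high) for w in weights]


# Well-separated predictions give a moderate weight ...
moderate = raw_weight(0.5, 0.6, 0.4)
# ... while nearly equal predictions make the weight explode.
abnormal = raw_weight(0.5, 0.401, 0.4)
```

The second call shows why thresholds are needed: the denominator is the difference between the two predictions, so the weight grows without bound as they approach each other.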
Fusion method 2
Although the feasibility of Fusion method 1 was discussed in section “Fusion method 1” and a solution for identifying abnormal points was presented, the threshold values must be preset using historical values. Therefore, there is still room to develop a more convenient fusion method.
The key to eliminating abnormal points in Fusion method 1 is to calculate threshold values by statistical methods. Some functions, however, have bounded ranges of their own. For example, the arctan function approaches ±π/2 as its argument tends to ±∞.
Therefore, we make some transformation based on Fusion method 1. In Fusion method 2,
Based on analysis in section “Fusion method 1,” if the predicted values of traffic flows
According to equations (15) and (16), the range of
Fusion method 3
As mentioned in section “Fusion method 1,” the two weight coefficients are related. Therefore, one weight coefficient can be fixed while the other remains variable, which means that equation (11) is no longer necessary. The relationship of equation (8) can be expressed by
where the weight coefficient of
Also, the weight coefficient of
The weight coefficient of
Because the time interval of traffic flow data collection on a freeway is generally set to 5, 10, or 15 min, and the vehicle count is relatively high even during off-peak periods, zero traffic flow is unlikely to occur. With Fusion method 3 proposed in this section, no values need to be set in advance, which makes it convenient in applications.
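A single-weight fusion of this kind can be sketched as follows. The form used here, one prediction taken with a fixed coefficient of 1 and the other multiplied by a variable weight, is an assumption inferred from the description above; the article's equations (17) and (18) are not available in the extracted text, and the names `fuse_single_weight` and `solve_single_weight` are hypothetical. Swapping the two prediction arguments gives the other variant.

```python
def fuse_single_weight(weight, pred_fixed, pred_weighted):
    """Assumed fusion form: fused = pred_fixed + weight * pred_weighted."""
    return pred_fixed + weight * pred_weighted


def solve_single_weight(actual, pred_fixed, pred_weighted):
    """Weight that would have reproduced the observed flow exactly."""
    if pred_weighted == 0:
        # Only a zero predicted flow makes the weight undefined.
        raise ZeroDivisionError("weighted prediction is zero; weight undefined")
    return (actual - pred_fixed) / pred_weighted


# Made-up values: field flow 12, predictions 10 (fixed) and 20 (weighted).
w = solve_single_weight(12.0, 10.0, 20.0)
fused = fuse_single_weight(w, 10.0, 20.0)
```

Under this assumed form the denominator is a predicted flow rather than the difference between two predictions, so the weight stays finite whenever the flow is nonzero, which is consistent with the remark that zero flow is unlikely on a freeway.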
Performance of proposed method
The traffic flow data used to evaluate the proposed two-dimensional predictive method were collected from a remote traffic microwave sensor (RTMS) detector fixed on a gantry on Jingtai Freeway (G3) in the southeast of Dezhou City. As shown in Figure 5, there are two lanes in each direction on Jingtai Freeway. The detector can monitor traffic flows 24 h a day, and the raw data collected by the RTMS detector were cleaned by the research team.

Figure 5. Data collection site and detector layout.
Test design
The Thursday traffic flow data from 20/10/2011 to 24/11/2011 (6 days) were selected as test group 1, and Tuesday traffic flow data from 04/01/2012 to 29/02/2012 (9 days) were selected as test group 2. Wednesday traffic flow data from 01/02/2012 to 07/03/2012 (6 days) were selected as test group 3. The raw 24-h traffic flow data were aggregated in 6-min intervals. Based on the data, the performance of the proposed predicted models was compared. In applications of Fusion method 1, threshold values should be set in advance to identify abnormal points. Also, the performance of the step predicting weight coefficients is presented.
The predictive method proposed in this article is implemented in MATLAB. First, we obtained the values of the weight coefficient in method 1 based on equation (12). Singular values occurred, which degraded the precision of the values predicted by method 1. To overcome this issue, method 2, method 3, and the amended method 1 were applied based on equations (14), (18), and (13), respectively. The values of the weight coefficient in these methods are shown in Figure 7. Finally, the predicted traffic flow values were obtained with the framework proposed in section “Frame design of two-dimensional predictive method” and the Kalman filter algorithm introduced in section “Prediction equations of discrete-time Kalman filtering.”
To evaluate the performance of the predicted models, two accuracy measures are employed. The first index is the mean relative error (MRE), which indicates the expected error as a fraction of the measurement, and the second index is the root mean square error (RMSE), which penalizes large prediction errors
where
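For reference, the two accuracy measures can be computed as in the sketch below. It assumes the standard definitions, MRE as the mean of |predicted − actual| / actual and RMSE as the square root of the mean squared error; the article's own notation is not recoverable from the extracted text, and the function names and sample values are hypothetical.

```python
import math


def mean_relative_error(actual, predicted):
    """Mean of |predicted - actual| / actual over all periods."""
    return sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the mean squared prediction error."""
    return math.sqrt(
        sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Made-up field values and predictions.
actual_flows = [100.0, 200.0]
predicted_flows = [110.0, 180.0]
mre = mean_relative_error(actual_flows, predicted_flows)
rmse = root_mean_square_error(actual_flows, predicted_flows)
```

Because MRE divides by the measured flow, a small absolute error during a low-flow period produces a large relative error, whereas RMSE is dominated by the largest absolute errors.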
The test involves five predictive methods, namely, the original method, method 1, method 2, method 3a, and method 3b. The original method means the original Kalman filtering method used to predict traffic flows. Methods 1, 2, and 3 refer to predicting traffic flows using Fusion methods 1, 2, and 3 (section “Frame design of two-dimensional predictive method”) of the proposed two-dimensional predictive method. Regarding method 3, if the weight coefficient of
Method 1 requires a threshold to be set in advance. We used 5 days of traffic flow data to calculate the weight coefficients. The 95th percentile obtained according to equation (13) is 3.9, which is used as the threshold in the test.
Furthermore, we select one existing predictive method and compare its performance with that of the method proposed in this article. The compared method is from the research by Q Shang et al. 34 The traffic flow data are from six successive Wednesdays.
Evaluation results and analysis
When method 1 is applied to predict traffic flows, the weight coefficients should be calculated first using equation (12). The values of raw

Figure 6. The abnormal points of raw parameter
To eliminate abnormal points, three methods are proposed in section “Frame design of two-dimensional predictive method,” and fluctuations of

Figure 7. The fluctuation of parameter
Using the proposed two-dimensional predictive methods, we can predict traffic flows using the data of groups 1 and 2. The value for each relative error is given in Table 2, and the RMSE is presented in Table 3. Evidently, lower errors indicate better algorithm performance. In addition, each model is compared to the original method, where negative values refer to the proposed method being more accurate.
Table 2. Comparisons of MRE for the prediction methods.
MRE: mean relative error.
Table 3. Comparisons of RMSE for the prediction methods.
RMSE: root mean square error.
Based on the test results shown in Tables 2 and 3, it is clear that the proposed two-dimensional predictive methods perform better than the original method in terms of accuracy and stability. In addition, method 3 is better than methods 1 and 2, showing an improvement of more than 40% in accuracy. Furthermore, method 3a performs best in traffic flow prediction, with an MRE of about 0.054.
In fact, the accuracy of weight coefficient prediction partly determines final results. In methods 1 and 2, there are two parameters
In addition, method 3 does not require presetting threshold values to eliminate abnormal points as in method 1, or transforming parameters using arctan function as in method 2, which means method 3 is more elegant and easier to use in applications. Therefore, the method 3 proposed in this article is the best fusion model for two-dimensional prediction method. Furthermore, performance comparison of the original Kalman filter method and proposed method 3a is shown in Figure 8.

Figure 8. Comparison of traffic flow predictions: (a) performance of original Kalman filter method and (b) performance of proposed method 3a.
Solid lines, in Figure 8, represent the actual traffic flow, and dotted lines refer to predicted values. The histograms present the MRE of each prediction method, which reflects accuracy. Using method 3a, almost all errors are less than 10%, being less than those using the original Kalman filter method. It is clear that method 3a performs better. Because traffic flows during early mornings and late nights are relatively small, the MREs are inevitably high, although absolute errors are at normal levels. Therefore, except off-peak periods, the maximum MRE is not larger than 20%, as shown in Figure 8(b).
Using the original Kalman filter method directly, the initial value should be set, which may bring deviations in the beginning. With the proposed method 3a, prediction values can be obtained accurately, because the introduction of traffic flow data of time series 2 can decrease errors in the beginning of prediction.
Comparison and analysis
To further evaluate the method proposed in this article, we select the predictive method proposed by Q Shang et al. 34 for comparison. As shown in section “Evaluation results and analysis,” the proposed method 3a predicts the traffic flow with relatively satisfactory precision. Therefore, in this section, we select another group of traffic flow data from six successive Wednesdays. Using MATLAB again, the results of the method proposed in this article were obtained. The results of the compared method were obtained following the procedure of Q Shang et al., 34 after the necessary training on a large amount of historical traffic flow data. The performance of the proposed method 3a is shown in Figure 9(a), while the performance of the compared method is shown in Figure 9(b). In addition, to reflect the precision of each predictive method, the MRE is also presented in Figure 9.

Figure 9. Comparison of performance of the two methods based on the group 3 data: (a) performance of proposed method 3a and (b) performance of the compared method.
As presented in Figure 9, the solid lines represent the actual traffic flow, and the dotted lines refer to predicted values. The histograms present the MRE of each prediction method, which reflects accuracy. From Figure 9(a), we can see that almost all errors are less than 10% using method 3a, which is consistent with the conclusion reached in section “Evaluation results and analysis.” The MRE values shown in Figure 9(b) are relatively high compared with those in Figure 9(a). Therefore, we conclude that method 3a performs better than the compared method. More precisely, the average MRE of method 3a is 6%, while the average MRE of the compared method is about 9%. In addition, the method proposed in this article can be used to predict the traffic flow directly, without any prior training, which makes it more flexible to implement in practice.
Conclusion
In this article, we predict short-term traffic flows with a two-dimensional prediction method that combines Kalman filtering with an adaptive weights allocation method and uses two-dimensional traffic flow time series data from a single detector. Specifically, the two-dimensional prediction method involves two dimensions: in the first dimension, we carry out the first-step traffic flow prediction based on the current and historical data, respectively, and in the second dimension, adaptive weights are calculated using Kalman filtering based on traffic flow field data. Because calculating the weights in the second dimension is challenging, we propose different algorithms for it. The three computing methods of adaptive weights allocation for the two forecasting results of the first dimension are described in detail, and their characteristics are discussed. In particular, the two-dimensional prediction method was applied to data from an RTMS detector fixed on a gantry on Jingtai Freeway (G3) in the southeast of Dezhou City.
We test the prediction ability of the two-dimensional prediction method. The performance of the method is studied using standard accuracy measures found in the literature, namely, the MRE and RMSE. The results show that the two-dimensional prediction method embedding three adaptive weights allocation methods is capable of providing reasonable predictions with respect to direct model outputs. Specifically, the prediction method based on Fusion method 3 for adaptive weights allocation presents a significant forecast improvement, increasing accuracy by about 40% compared to the other methods. Also, the proposed predictive method achieves lower errors than one existing method. Good results of the two-dimensional prediction method can be achieved by taking into consideration the multidimensional feature of single source data and adaptive weights allocation.
Future work will consider establishment of multidimensional traffic prediction methods based on multidimensional data from a single detector as input, in order to further enhance prediction accuracy. In addition, we will attempt to replace the base prediction model, which is Kalman filtering, with other prediction models to test universal applicability of our forecasting idea and framework.
