1. 套话
- 结论性 This research provides strong evidence suggesting that
- 得到良好结果 provides promising results
- 良好的实用性 it can be easily incorporated with real-time traffic control for proactive freeway traffic management.
- 最原始的目标 The prime objective of traffic management strategies is to
- 局限于 has been limited to
- 随着科技进步 with the advancement in technology and the wide deployment of intelligent transportation systems
- 该领域已有大量的研究 , considerable amount of research has been focused on the topic
- 提供方法解决问题 In this paper, we present a methodology for short-term traffic forecast based on learning similar traffic patterns as a reference for providing the predictions on future traffic.
2. 短时交通流
2.1 预测
- 短时交通流预测 short-term traffic flow rate forecasting
- 短时交通预测 short-term traffic forecasting
- 及时可靠准确的预测 timely, reliably, and accurately
- 交通管理和控制应用 traffic management and control applications
- 动态选择交通管理策略 dynamically apply alternative strategies proactively in response to predicted traffic conditions.
- 短时交通流预测的必要性 Short-term traffic forecasting models, therefore, are an integral element of the toolset needed for real-time traffic control and management. Moreover, such tools are important in providing travelers with reliable travel time information, optimizing traffic signals, and deployment of emergency management systems.
- 提前预测交通量 predicting the expected volume of traffic ahead of time
- 短时交通流预测的原理 The availability of a vast amount of spatial and temporal traffic data coupled with advancements in statistics and data analysis techniques have created an opportunity to perform short-term traffic forecast with a reasonable prediction accuracy and short processing time.
- 短时交通流预测的定义 Short-term traffic forecast aims at predicting the evolution of traffic over time horizons ranging from few seconds to few hours
- 短时交通流预测的应用领域 This interest is the direct result of an increasing need for developing user friendly applications which can both provide accurate information to drivers and be used for signal optimization.
- 交通管理部门 transportation agencies
- 交通管理与控制 traffic management and control application
- 时间粒度 data resolution (e.g. 5mins data)
- 交通拥堵 traffic congestion
- 在线应用 online implementation
- 高速公路 freeway
2.2 相似交通流识别
- 识别相似交通模式 identifying similar traffic pattern
- 交通流模式识别 traffic flow pattern analysis
- Identifying spatial and temporal flow patterns has been an important consideration in short-term traffic forecasting research.
- 在线交通流状态识别 Online traffic state identification
聚类:
- An optimal fit of the statistical characteristics is provided by maximizing the intra-cluster data point distances and minimizing inter-cluster data point distances through a joint utilization of the Bayesian Information Criterion and the ratio of change of the dispersion measurements.
3. 文献综述
- 短时交通流预测算法分成四类 the approaches used in short-term traffic forecast can be broadly classified into four categories: Naïve, parametric, non-parametric, and hybrid. 四类模型的内涵以及优缺点见18.
- 非参数方法 non-parametric approaches are mostly data-driven and apply empirical algorithms to provide the predictions
- 非参数方法的优点 Such approaches are advantageous as they are free of any assumptions regarding the underlying model formulation and the uncertainty involved in estimating the model parameters. 非参数方法不是模型完全没有参数,而是不涉及公式里需要估计的参数。
- 过去研究的缺陷 The main limitations of the works of these authors are that the simplest form of K-NN was used. In addition, majority of these authors failed to compare their results with parametric models
- 得出结论的困难性 In summary, considering the fact that there are a wide variety of models developed for short-term traffic forecasts, it can be overwhelmingly difficult to identify the one that is most suitable for specific traffic conditions. Moreover, since the studies were conducted under different settings (such as traffic parameters considered, the nature of datasets and their sources, aggregation intervals, forecast step durations, prediction horizons and performance measures considered), it is very difficult to fairly compare one approach over another.
- 非参数方法更好 suggested that in the context of traffic forecast, nonparametric approach is the first choice as the input and output traffic variables are noisy and the relationship between each other is nonlinear and poorly understood.
4. 模型
4.1 数据描述与预处理
- 缺失值 missing values
- 描述时间间隔 predictions are provided over a 15-min step with a forecast horizon of 6 steps (i.e., predictions were made for one hour and thirtyminutes at 15-min intervals).
- 12 steps ahead at 15-min intervals
- 45 steps ahead at 20-s interval.
4.2 平滑
- 数据粒度越小,数据噪声越大。The higher the data resolution (e.g. 30 s data), the larger the portion of noise of the time series of the traffic variables
- 被LOESS技术过滤(平滑) are slightly filtered using locally estimated scatter-plot smoothing (loess) technique.
- Several approaches have been utilized for reducing noise from time series before proceeding with predictions; these range from simple smoothing, to wavelets and fuzzy algorithms (Jiang and Adeli, 2004; Boto-Giralda et al., 2010).
4.3 KNN
- non-parametric and data-driven methodology
- enhanced K-nearest neighbor (K-NN) algorithm
- 相似性度量 similarity measure
- 多步预测 multiple forecast steps
- 性能检验 its performance is tested under …
- 算法原理概述 Similar traffic profiles are identified using an enhanced K-nearest neighbor algorithm based on measurements of a sequence of volume of traffic at 15-min intervals.
- KNN算法原理 For a given prevailing volume profile, K similar profiles (nearest neighbors or candidates) are identified from a large collection of a historic traffic database. The neighboring candidates drawn from the historic database corresponding to the desired forecast time or horizon are aggregated to provide predictions of future flow rate measurements.
- 相似性度量方法识别相似交通流 two similarity measures are considered to determine similar traffic patterns
- 流量的滞后项部分 the lagging volume profiles
- 该算法的主要优点:
- The main advantage of this approach is that the predictions are generated based on the observed historical traffic patterns that are discovered from the historical datasets.
- this approach is capable of providing predictions over multiple time steps or a trace forecast over a specified prediction horizon(预测时间窗).
- 良好的模型部署能力 Considering the simplicity of the proposed approach and a relatively short computation time, it can be easily incorporated with online traffic management strategies to provide short-term traffic forecasts in real-time.
4.4 预测评价
- MAE mean absolute percent error
- 性能检验 its performance is tested under …
- 被多次验证过并且和其他方法比较过 The performance of the proposed approach is tested using a wide variety of freeway flow rate datasets collected from different regions and is compared with the works of Guo et al. (2014) which developed several advanced parametric models based on time series analysis reinforced with an adaptive Kalman filter.
- 模型的效果更好 The proposed methodology is found to outperform the works of Guo et al. (2014) in terms of forecast accuracy; provided that enough samples are available in the archived datasets as explained in the paper.
- 优化参数和模型的鲁棒性 the level of details presented in terms of optimizing the parameters of the enhanced K-NN and testing the robustness of the model under different traffic scenarios corresponding to different datasets collected from different regions and different proportion of missing values is unprecedented.
5. 算法
5.1 算法原理
forecasting is performed by exploiting similarities in traffic patterns. In essence, the future is predicted based on similar profiles from the archived data.
- 图形展示 This is graphically shown in Fig. 1. shown in solid red line in Fig. 1
5.2 几个名词定义:
- Subject profile: The traffic profile corresponding to a specific day for which the forecast is desired to be made
- Candidate profiles: Also called nearest neighbors, these are set of traffic profiles with similar pattern to the subject profile selected from a pool of archived dataset. The number of such candidate profiles considered is referred to as K.
- Lag duration: This is the time window considered to determine similarity between the subject and candidate profiles. The number of the sequence of traffic measurements considered in the lag duration is represented by m.
- Forecast duration
- Lagging part of traffic profile: This is the part of traffic profile used for determining similarity of traffic patterns.
- Size of historic profile: This refers to the magnitude of the archived data, or the search space, from which the candidate profiles are drawn
5.3 算法描述:
- The algorithm identifies the K most similar historic sequences with similar pattern to the one being examined. The combination of the nearest values corresponding to the time step where forecast is desired to be made will be the expected future value of the sequence being examined.
5.4 算法理念:
- The notion of K-NN-based forecast is that the pattern of the sequence of observations is repeated over time. Therefore, if a previous pattern can be identified to be similar to the current pattern, then the subsequent values of the previous sequence can be used to predict the future values of the target sequence.
5.5 算法的几个关键问题:
In the process of K-NN-based identification of similar traffic conditions and using them as a basis for providing traffic predictions, a few questions need to be answered to specify the required K-NN variables so that the prediction errors are minimal. These include:
- how to identify similar traffic patterns,
- how large should the lag duration be,
- how many nearest neighbors (candidates) should be considered and what
- mechanism of aggregation should be used to provide an estimate of the future traffic flow rate.