3. Feeding the current state of the network
along with its traffic history to some type of
model that predicts the travel time of a trip
that departs at a specific time from one point
in the network and ends at another.
Designing such a model is challenging, as
is finding a set of current or historical
parameters with real prediction power. The
most useful model may be road dependent,
and even for a single road, it has been
shown that different models may describe
the traffic behavior more accurately at
different traffic conditions. For instance,
one model may be more useful when the
road is congested, while another may be
more accurate when vehicles are flowing
freely.
In short, accurate travel time prediction is
challenging due to the high cost of sensing and
collecting enough useful current and historical
traffic data. Even when such data is available, it is
still difficult to determine which type of model best
describes the traffic behavior, and which traffic
parameters should be fed to the model for the best
predictions. Moreover, the best course of action may
be to use two or more models and switch between
them depending on current traffic conditions. This
option adds a new challenge: deciding which model
in the set to use for a given input, or whether to
combine the outputs of several models, each with
its own weight, into a final travel time prediction.
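The two options described above, switching between
models and weighting their outputs, can be sketched
as follows. This is a minimal illustration, not the
method proposed in this paper: both regime models,
the segment length, the speed thresholds, and all
function names are hypothetical.

```python
SEGMENT_KM = 10.0  # hypothetical segment length


def free_flow_model(speed_kmh):
    # Hypothetical model: travel time (min) follows directly from speed.
    return 60.0 * SEGMENT_KM / speed_kmh


def congested_model(speed_kmh):
    # Hypothetical model: same base time plus a fixed 5-minute penalty.
    return 60.0 * SEGMENT_KM / speed_kmh + 5.0


def congestion_weight(speed_kmh, low=50.0, high=70.0):
    # Linear ramp clipped to [0, 1]: 1 at or below `low`, 0 at or above `high`.
    w = (high - speed_kmh) / (high - low)
    return min(1.0, max(0.0, w))


def predict_by_switching(speed_kmh, threshold=60.0):
    # Option 1: pick exactly one model based on the current condition.
    model = congested_model if speed_kmh < threshold else free_flow_model
    return model(speed_kmh)


def predict_by_weighting(speed_kmh):
    # Option 2: blend both outputs with condition-dependent weights.
    w = congestion_weight(speed_kmh)
    return w * congested_model(speed_kmh) + (1.0 - w) * free_flow_model(speed_kmh)
```

Switching produces a jump in the prediction at the
threshold, whereas weighting moves smoothly between
the two model outputs as conditions change.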
In this paper, a new method for travel time
prediction is proposed. This method uses a mixture
of linear regressions, motivated by the fact that the
travel time distribution is not unimodal: two traffic
regimes can exist, one in the congested state and
the other in the free-flow state. The proposed
model was built and tested using probe data
provided by INRIX and supplemented with
traditional road sensor data as well as data from
mobile devices and other sources. The dataset was collected
from a freeway stretch of I-66 eastbound connecting
I-81 and Washington, D.C. The traffic on this stretch
is often extremely heavy, which makes travel time
prediction more challenging, but also makes the data
more valuable and helps create a more realistic
model.
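To make the idea of a two-regime mixture of linear
regressions concrete, the sketch below fits one with
a basic EM loop on synthetic data. It is an
illustration under stated assumptions, not the
estimation procedure used in this paper: there is a
single hypothetical predictor, Gaussian noise in each
regime, a constant mixing proportion, and every name
and number in the code is made up for the example.

```python
import math
import random


def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b, sd)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = my - b * mx
    var = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / sw
    return a, b, max(math.sqrt(var), 1e-3)  # floor sd to avoid collapse


def fit_two_regime_mixture(x, y, n_iter=50):
    """EM for a 2-component mixture of lines with a constant proportion."""
    # Initialise responsibilities (P(congested | point)) by a median split.
    median = sorted(y)[len(y) // 2]
    resp = [0.9 if yi > median else 0.1 for yi in y]
    for _ in range(n_iter):
        # M-step: refit each regime's line using the responsibilities.
        slow = weighted_line_fit(x, y, resp)
        fast = weighted_line_fit(x, y, [1.0 - r for r in resp])
        p_slow = sum(resp) / len(resp)
        # E-step: recompute responsibilities from the fitted densities.
        resp = []
        for xi, yi in zip(x, y):
            d_slow = p_slow * normal_pdf(yi, slow[0] + slow[1] * xi, slow[2])
            d_fast = (1.0 - p_slow) * normal_pdf(yi, fast[0] + fast[1] * xi, fast[2])
            total = d_slow + d_fast
            r = d_slow / total if total > 0.0 else 0.5
            resp.append(min(1.0 - 1e-6, max(1e-6, r)))
    return p_slow, fast, slow


def make_synthetic_data(n_per_regime=200, seed=0):
    # Hypothetical data: x is a predictor (min), y is travel time (min).
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n_per_regime):  # free-flow regime
        xi = rng.uniform(5.0, 15.0)
        x.append(xi)
        y.append(10.0 + 0.2 * xi + rng.gauss(0.0, 0.5))
    for _ in range(n_per_regime):  # congested regime
        xi = rng.uniform(5.0, 15.0)
        x.append(xi)
        y.append(20.0 + 1.0 * xi + rng.gauss(0.0, 2.0))
    return x, y
```

Calling `fit_two_regime_mixture(*make_synthetic_data())`
returns the estimated congested-regime proportion and
an (intercept, slope, standard deviation) triple for
each regime. A single regression fitted to the same
data would average over both regimes and describe
neither well.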
2 RELATED WORK
Various methods and algorithms have been proposed
in the literature for travel time prediction. These
methods can roughly be classified into two main
categories: statistical-based data-driven methods and
simulation-based methods. This section focuses on
the statistical-based methods since the proposed
solution in this paper falls under this class of
methods, and because more research in the literature
uses statistical methods.
Several researchers have fitted different regression
models to predict travel time. A typical approach is
to fit a multiple linear regression (MLR) model
using explanatory variables representing the
instantaneous traffic state and historical traffic data,
as in, for example, (Rice and van Zwet, 2004; Zhang
and Rice, 2003). The model proposed in (1) was
even able to use a single linear regression (SLR) to
successfully provide acceptable travel time
predictions. Some researchers developed hybrid
methods where a regression model was used in
conjunction with other advanced statistical methods.
For example, (Kwon et al., 2000) used regression
with statistical tree methods. Another approach
(Chakroborty and Kikuchi, 2004) proposed an SLR
model using bus travel time to predict automobile
travel time.
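As an illustration of the MLR approach described
above, the sketch below fits travel time on one
instantaneous and one historical predictor by
ordinary least squares. It is not taken from any of
the cited works: the choice of predictors, the
variable names, and the noise-free synthetic numbers
are assumptions made for the example.

```python
import numpy as np

# Hypothetical observations (minutes) for one road segment.
inst = np.array([10.0, 12.0, 15.0, 20.0, 25.0, 18.0])  # instantaneous travel time
hist = np.array([11.0, 11.0, 14.0, 16.0, 22.0, 19.0])  # historical average
observed = 2.0 + 0.6 * inst + 0.4 * hist                # synthetic, noise-free


def fit_mlr(instantaneous, historical, travel_time):
    # Design matrix with an intercept column and the two predictors.
    X = np.column_stack([np.ones_like(instantaneous), instantaneous, historical])
    coef, *_ = np.linalg.lstsq(X, travel_time, rcond=None)
    return coef  # [intercept, weight on instantaneous, weight on historical]


def predict_mlr(coef, instantaneous, historical):
    return coef[0] + coef[1] * instantaneous + coef[2] * historical


coef = fit_mlr(inst, hist, observed)
```

Dropping the historical column from the design matrix
reduces the same code to a single linear regression.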
Regression models are generally effective for
short-term travel time prediction, whereas their
long-term predictions are less accurate.
Regression models are also reported to be more
suitable for free-flow than for congested traffic,
and to predict travel time inaccurately when
incidents have occurred (Guin et al., 2013).
The idea of using mixture models for different
traffic regimes has also been explored previously
(Guo et al., 2012). The model developed in this
paper attempts to overcome the drawbacks of
previous work that used mixture models of two or
three components to model travel time reliability.
That work suffers from the following limitations:
1. The mean of each component is not
modeled as a function of the available
predictors.
2. The proportion variable is fixed at each
time slot, which limits the model’s
flexibility.
3. The information provided for a given time
slot of the day is limited to the (fixed)
probability of each component and the 90th percentile.
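The second limitation can be illustrated by
contrasting a proportion that is looked up per time
slot with one that responds to a predictor. The
sketch below is an assumption made for illustration
only: the logistic form, the upstream-speed
predictor, the coefficients, and the table values
are hypothetical and are not taken from the cited
work or from the proposed model.

```python
import math

# Hypothetical fixed proportions: one congestion probability per time slot.
FIXED_PROPORTION_BY_SLOT = {"07:00": 0.7, "12:00": 0.2}


def fixed_congestion_probability(time_slot):
    # Same value for every day, whatever the current traffic state is.
    return FIXED_PROPORTION_BY_SLOT[time_slot]


def predictor_dependent_probability(upstream_speed_kmh, intercept=6.0, slope=-0.1):
    # Logistic function of a predictor: lower speed gives a higher probability.
    return 1.0 / (1.0 + math.exp(-(intercept + slope * upstream_speed_kmh)))
```

With the fixed table, a 07:00 trip receives the same
congestion probability on a clear day as on a day
with an upstream slowdown. The predictor-dependent
version distinguishes the two cases.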
Another class of statistical-based methods in the
literature uses time series models for travel time
prediction, using, for example, auto-regressive
prediction models (Oda, 1990, Iwasaki and Shirao,
1996, D'Angelo et al., 1999), multivariate time series
models (Al-Deek et al., 1998), and the auto-
VEHITS 2018 - 4th International Conference on Vehicle Technology and Intelligent Transport Systems