5 RELATED WORK
To provide reliable resources for planning involving
moving objects, methodologies were developed to
predict the movement dynamics in uncertain
contexts. In fact, when moving along road networks
(especially in urban zones), mobile users usually have
no idea how many cars are moving with them,
where they come from and where they are going.
However, these users are free to choose another
route (unless the route is mandatory, as for buses)
in search of the shortest travel time.
Raw locations tracked by GPS receivers have
been used in controlled applications such as private
bus and truck fleets. Masiero et al. (2011)
present a methodology based on Support Vector
Regression (SVR) to predict the travel time of
delivery trucks based on previous trajectories.
Sinn et al. (2012) describe another application
for travel time prediction from GPS points. In
addition, they present a method to automatically
extract bus routes, stops and schedules. Pang et al.
(2011) proposed another methodology for travel
time prediction based on GPS data from buses.
However, they use smartphones to gather the data
for the analysis. They likewise present a method to
automatically extract the bus routes, the stops and
the schedules. In all three cases, the analysis
considered fixed trajectories (stops and moves) and
controlled speeds. Hence, the tracked data are not
representative enough to model the global average
speed of a road network.
The method presented in Min and Wynter (2011)
is based on spatial-correlation matrices and average
speeds obtained from historical data of some
categories of roads and provides predictions of speed
and volume over 5-min intervals for up to 1 h in
advance.
The analysis presented by Yuan et al. (2011) is
based on three months of GPS trajectories
collected from 33,000 taxis in Beijing and aims to
detect anomalies in traffic behavior. Although taxi
trajectories are supposed to be more flexible, they
are influenced by the existence of either permanent
or temporary points of interest such as tourist
attractions, airports, hotels or convention centers.
On the other hand, in Biagioni et al. (2011), the
taxi drivers' intelligence in choosing faster routes is
modeled by analyzing the trajectories they usually
take. In this case, the traversal frequencies along
the road network are considered instead of speeds.
Therefore, this method ranks the streets by the
drivers’ preferences (as a consequence of their
previous experiences).
Letchner et al. (2006) present a method that
considers the previous individual history (i.e., the
user’s preferences) to suggest routes for general
users (instead of taxi drivers).
Our contribution is the generation of more
representative statistics based on the actual behavior
of non-specific groups of drivers or categories of
roads.
6 CONCLUSIONS
We proposed a methodology to enrich a road
network database with statistics about the actual
speeds, based on the analysis of raw trajectories
tracked by ordinary vehicles during one month. These
results reflect how traffic flow behaves across the
days of the week and the hours of each day,
although the methodology allows different time
intervals. Moreover, they will support movement
planning by proposing routes based on the estimated
travel time instead of the travel length.
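
The difference between planning by travel length and planning by estimated travel time can be illustrated with a minimal sketch. It assumes a toy network whose node names, lengths and mean speeds are hypothetical, and it uses a plain Dijkstra search rather than any routing component described in this paper.

```python
import heapq

def shortest_path(graph, source, target, weight):
    """Dijkstra search; graph[u] is a list of (v, attrs), weight(attrs) is the edge cost."""
    best = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, attrs in graph.get(node, []):
            new_cost = cost + weight(attrs)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]

# Hypothetical network: lengths in km, enriched mean speeds in km/h.
graph = {
    "A": [("B", {"length_km": 2.0, "speed_kmh": 10.0}),
          ("C", {"length_km": 3.0, "speed_kmh": 60.0})],
    "B": [("D", {"length_km": 2.0, "speed_kmh": 10.0})],
    "C": [("D", {"length_km": 3.0, "speed_kmh": 60.0})],
}

# Cost = length: the shorter but congested route A-B-D is chosen.
by_length = shortest_path(graph, "A", "D", lambda a: a["length_km"])
# Cost = length / mean speed (hours): the longer but faster route A-C-D is chosen.
by_time = shortest_path(graph, "A", "D",
                        lambda a: a["length_km"] / a["speed_kmh"])
```

In this toy case the two criteria select different routes, which is the situation in which speed statistics per road segment become useful for planning.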
The method is based on three steps: (1) map-matching,
(2) temporal partition of GPS points and
(3) statistics computation and road segment
enrichment. Because of inaccuracies in GPS
positioning and off-road points, we limited the
analysis to the points tracked near the roads, i.e.,
within buffer zones whose width must be compatible
with the real width of the respective road.
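
Steps (2) and (3) can be sketched as follows, assuming that map-matching has already assigned each GPS point to a road segment. The function name, the tuple layout and the choice of count, mean and standard deviation as the stored statistics are illustrative assumptions, not the implementation used in this work.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

def enrich_segments(matched_points):
    """Group speeds by (segment, weekday, hour) and summarize each group.

    matched_points: iterable of (segment_id, timestamp, speed_kmh), where
    timestamp is a datetime and the point is already map-matched.
    """
    groups = defaultdict(list)
    for segment_id, timestamp, speed_kmh in matched_points:
        # Temporal partition: weekday() is 0 for Monday, hour is 0..23.
        key = (segment_id, timestamp.weekday(), timestamp.hour)
        groups[key].append(speed_kmh)
    # Statistics computation: one record per segment and time interval.
    return {
        key: {"n": len(speeds), "mean": mean(speeds), "std": pstdev(speeds)}
        for key, speeds in groups.items()
    }

# Hypothetical sample: two points on segment 17, Monday 08:00-09:00,
# and one point on the same segment on Monday 09:00-10:00.
points = [
    (17, datetime(2013, 1, 7, 8, 15), 30.0),
    (17, datetime(2013, 1, 7, 8, 40), 50.0),
    (17, datetime(2013, 1, 7, 9, 5), 20.0),
]
records = enrich_segments(points)
```

Keeping the point count `n` in each record makes it possible to identify intervals whose statistics rest on too few observations.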
The combined analysis of Tables 1 and 2 shows
that, by enlarging the buffer zones, the gain in the
number of oriented points is limited. Furthermore,
among these points, the ratio of outliers increases
rapidly. The direction analysis detected outliers,
although it reduced the size of the sample of GPS
points. Despite the mismatches, the number of one-way
roads with enriched data increases because most of
the additional mismatches occurred on just a few
roads.
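
One way to picture a direction check on one-way roads is to compare the heading of a GPS point with the bearing of the matched segment. The sketch below is only an assumption about how such a test could look: the planar coordinates, the function names and the 45-degree tolerance are hypothetical and are not taken from the analysis reported here.

```python
import math

def bearing_deg(x1, y1, x2, y2):
    """Bearing of the vector (x1, y1) -> (x2, y2) in planar coordinates.

    Measured clockwise from the +y axis (north), in degrees in [0, 360).
    """
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 360.0

def angular_difference(a, b):
    """Smallest absolute difference between two bearings, in [0, 180]."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)

def is_direction_outlier(point_heading, segment_bearing, tolerance_deg=45.0):
    """Flag a point whose heading deviates from the one-way segment bearing."""
    return angular_difference(point_heading, segment_bearing) > tolerance_deg

# Hypothetical one-way segment running due east.
segment_bearing = bearing_deg(0.0, 0.0, 100.0, 0.0)
with_flow = is_direction_outlier(80.0, segment_bearing)      # close to east
against_flow = is_direction_outlier(270.0, segment_bearing)  # heading west
```

A point flagged this way would be dropped from the sample, which is consistent with the observation that detecting outliers reduces the number of usable GPS points.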
Atypical behavior can also be detected. In these
cases, some observations must be discarded to keep
the statistics meaningful.
We emphasize that many of the computed
statistics were based on too few points for each time
interval. With 3-meter-wide buffer zones,
82% of the records are computed from fewer than
10 points. For the 5- and 8-meter-wide buffer zones,
the corresponding ratios are 79% and 76%,
respectively (we do not consider this a
representative gain). To improve these ratios, the
methodology must be refined to consider more
ICEIS 2013 - 15th International Conference on Enterprise Information Systems