design that should dynamically adapt to the data
format, in particular scaling up and down. ITS and
associated applications aim at adding value to
collected data. Adding value to big data depends on
the events it represents and on the type of processing
operations applied to extract such value (i.e.,
stochastic, probabilistic, regular or random). Adding
value to data, given its volume and
variety, can require significant computing, storage
and memory resources. Value can be related to the
quality of big data (veracity), which concerns (1) data
consistency, related to its statistical
reliability; and (2) data provenance and trust, defined by
data origin and by collection and processing methods,
including trusted infrastructure and facilities.
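The two veracity dimensions named above, consistency and provenance, can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration and not a method taken from this paper: the `Provenance` fields, the use of the coefficient of variation as a consistency proxy, and the `trust_weight` parameter are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Sequence


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record: origin plus collection and processing methods."""
    origin: str
    collection_method: str
    processing_method: str
    trusted_infrastructure: bool


def consistency_score(values: Sequence[float]) -> float:
    """Assumed consistency proxy in [0, 1]: 1 / (1 + coefficient of variation)."""
    if not values:
        return 0.0
    m = mean(values)
    if m == 0:
        return 0.0
    return 1.0 / (1.0 + pstdev(values) / abs(m))


def veracity(values: Sequence[float], provenance: Provenance,
             trust_weight: float = 0.5) -> float:
    """Blend consistency and trust; the linear weighting is an illustrative choice."""
    trust = 1.0 if provenance.trusted_infrastructure else 0.0
    return (1.0 - trust_weight) * consistency_score(values) + trust_weight * trust
```

Under these assumptions, a series of identical readings from a trusted source receives the highest score, while noisy readings or an untrusted source lower it.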
Processing and managing big data requires enabling
infrastructures, given its volume and veracity and the greedy
algorithms that are sometimes applied to it, for
example to give it value and make it useful for
applications. Cloud
architectures provide seemingly unlimited resources that can
support big data management and exploitation. The
essential characteristics of the cloud are on-demand
self-service, broad network access, resource
pooling, rapid elasticity and measured service
(Grance 2008). These characteristics make it
possible to design and implement services to deal
with big data management and exploitation using
cloud resources to support applications such as ITS.
The objective of our work is to manage and
aggregate cloud services that manage big data and
assist decision making for transport systems. This
paper therefore presents our approach to developing data
storage, data cleaning and data integration services
in order to build an efficient decision support system. Our
services will implement algorithms and strategies
that consume storage and computing resources of the
cloud. For this reason, appropriate consumption
models will guide their use.
The remainder of the paper is organized as
follows. Section 2 describes work related to ours.
Section 3 introduces our approach for managing
transport big data on the cloud for supporting
intelligent transport systems applications. Section 4
presents a case study of the application that validates
our approach. Finally, Section 5 concludes the paper
and discusses future work.
2 RELATED WORK
This section focuses on big data transport projects,
in particular those aiming to optimize taxi usage, and on big data
infrastructures and applications for transport data
events.
Transdec (Demiryurek et al. 2010) is a project of
the University of Southern California to create a big data
infrastructure adapted to transport. It is built on three
tiers comparable to the MVC (Model, View,
Controller) model for transport data. The
presentation tier, based on Google™ Maps, provides
an interface to create queries and display the
results; the query interface provides standard queries
for the presentation tier; and the data tier is a
spatiotemporal database built from sensor data and
traffic data. This work provides an interesting query
system that takes into account the dynamic nature of
town data and provides time-relevant results in real
time. Urban insight (Artikis et al. 2013) is a
European project studying European town planning.
In Dublin, the project works on event detection through
big data, in particular on an accident detection
system using video streams from CCTV (Closed
Circuit Television) and crowdsourcing. Using data
analysis, they detect anomalies in the traffic and
identify whether each one is an accident. When there is
ambiguity, they rely on crowdsourcing to get further
information. The RITA (Thompson et al. 2014)
project in the United States is trying to identify new
sources of data provided by connected infrastructure
and connected vehicles. Its goal is to propose more
data sources usable for transport analysis. (Jian et al.
2008) propose a service-oriented model to
accommodate the data heterogeneity of several Chinese
towns. Each town maintains its own data and a service
that allows other towns to understand that data.
These services are aggregated to provide a global
data sharing service. These papers propose
methodologies to account for data veracity and
integrate heterogeneous data into one query system.
A promising direction would be to produce
predictions based on this data in order to build useful
decision support systems.
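The service-oriented model described above, in which each town keeps its own data and exposes a service that the others can understand, can be illustrated with a minimal sketch. It is not the design of (Jian et al. 2008): the class names `TownService` and `GlobalSharingService`, the common schema (`town`, `road`, `speed_kmh`) and the sample records are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]  # assumed common schema: town, road, speed_kmh


class TownService:
    """Wraps one town's local records and translates them to the common schema."""

    def __init__(self, town: str, raw_records: List[Record],
                 to_common: Callable[[Record], Record]) -> None:
        self.town = town
        self._raw = raw_records      # the data stays in the town's own format
        self._to_common = to_common  # town-specific translation function

    def query(self) -> List[Record]:
        # Translate each local record and tag it with the town it came from.
        return [dict(self._to_common(r), town=self.town) for r in self._raw]


class GlobalSharingService:
    """Aggregates the town services behind a single query entry point."""

    def __init__(self, services: List[TownService]) -> None:
        self._services = services

    def query(self, predicate: Callable[[Record], bool] = lambda r: True) -> List[Record]:
        return [rec for s in self._services for rec in s.query() if predicate(rec)]


# Hypothetical usage: two towns with different field names and speed units.
town_a = TownService("A", [{"rd": "A1", "v_ms": 10.0}],
                     lambda r: {"road": r["rd"], "speed_kmh": r["v_ms"] * 3.6})
town_b = TownService("B", [{"street": "B7", "kmh": 50.0}],
                     lambda r: {"road": r["street"], "speed_kmh": r["kmh"]})
shared = GlobalSharingService([town_a, town_b])
fast_roads = shared.query(lambda r: r["speed_kmh"] > 40)
```

In this sketch the heterogeneity is confined to each town's translation function, so the global service only ever handles records in the common schema.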
(Jagadish et al. 2014) propose a big data
infrastructure based on five steps: data acquisition,
data cleaning and information extraction, data
integration and aggregation, big data analysis and
data interpretation. (Chen et al. 2014) use Hadoop-GIS
to get information on demographic composition
and health from spatial data. (Lin & Ryaboy 2013)
present their experience at Twitter in extracting
information from log data. They concluded
that an efficient big data infrastructure is a balance between
speed of development, ease of analysis, flexibility
and scalability. Proposing a big data infrastructure
on the cloud will make developing big data
infrastructures more accessible to small businesses
VEHITS 2015 - International Conference on Vehicle Technology and Intelligent Transport Systems