transportation systems and poor performance (Gurmu
and Fan, 2014) compared with other methods.
Machine learning methods, such as support vector
machines (SVM) and artificial neural networks
(ANN), are also widely used in travel time
prediction. ANN is based on the principles of
biological neural networks and was introduced by
McCulloch and Pitts (1990). As stated by Fan and
Gurmu (2015), ANNs are suitable for prediction
tasks even when the physical processes related to
the route are not clearly specified. SVM is
generally based on two ideas: mapping feature
vectors in a nonlinear way and finding a hyperplane
that separates the data (Kulkarni and Harman,
2011). SVM handles both classification and
regression, and as stated by Altinkaya and Zontul
(2013), few applications in the field of
transportation use the SVM method. As indicated by
Julio et al. (2016), the support vector regression
(SVR) algorithm has shown its potential in
transportation as an accurate predictor.
Amita et al. (2015) compared two models: an
artificial neural network model and a multilinear
regression model. Parameters such as dwell time,
delays and distance between stops were used as
input data for both models. The results of this
study showed that the artificial neural network is
more accurate and robust. A study carried out by
Fan and Gurmu (2015) found ANN to be a better
prediction model than models based on the
historical average and the Kalman filter. It was
also stated that acceptable predictions can be
obtained by using only arrival and departure times.
A combination of SVM and genetic algorithms was
used to predict bus arrival time by Yang et al.
(2016). The study concluded that the proposed
method was more accurate than traditional SVM and
ANN. An interesting approach was introduced by
Julio et al. (2016), where machine learning
algorithms such as SVM, ANN and Bayesian networks
were compared for predicting bus travel speeds
from GPS data. The results showed that ANN
performed better than the other selected methods.
GPS is one of the technologies used in a huge
number of applications today (Gowtham and Mehdi,
2016). Many researchers have stated that GPS can
be used in many applications and that the routes
and locations of a vehicle can be followed by
means of GPS (Verma and Bhatia, 2013). This paper
focuses on applying a linear regression model and
SVR to predict bus arrival times or, to be more
specific, on how both models perform with a given
limited data set (GPS data of bus location and bus
driven distance).
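To make the linear regression idea concrete, the following minimal sketch fits travel time as a linear function of driven distance using ordinary least squares. It is an illustration only and not the model configuration used in this paper: the function names `fit_line` and `predict_time` and the sample values are hypothetical.

```python
from statistics import mean


def fit_line(distances_m, times_s):
    """Ordinary least squares fit of time = intercept + slope * distance."""
    if len(distances_m) != len(times_s) or len(distances_m) < 2:
        raise ValueError("need at least two paired samples")
    d_bar, t_bar = mean(distances_m), mean(times_s)
    # Sum of squared deviations of the predictor (driven distance).
    sxx = sum((d - d_bar) ** 2 for d in distances_m)
    if sxx == 0:
        raise ValueError("distances must not all be equal")
    # Sum of cross deviations between distance and time.
    sxy = sum((d - d_bar) * (t - t_bar)
              for d, t in zip(distances_m, times_s))
    slope = sxy / sxx
    intercept = t_bar - slope * d_bar
    return intercept, slope


def predict_time(intercept, slope, distance_m):
    """Predicted elapsed time (s) after driving distance_m metres."""
    return intercept + slope * distance_m


# Hypothetical samples: driven distance (m) and elapsed time (s),
# one reading every 30 seconds. These are NOT values from the data set.
sample_distances_m = [0.0, 250.0, 500.0, 750.0, 1000.0]
sample_times_s = [0.0, 30.0, 60.0, 90.0, 120.0]

intercept, slope = fit_line(sample_distances_m, sample_times_s)
eta_s = predict_time(intercept, slope, 1500.0)
```

With a single predictor the fit has a closed form, so no external library is needed. An SVR model would replace `fit_line` with a kernel-based regressor trained on the same inputs.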
The authors chose this topic because their native
city, Jelgava, needs a smart public bus system.
Jelgava is the fourth largest city in Latvia and a
historical centre of the Zemgales region; it lies
42 km from the Latvian capital, Riga, and has
approximately 62 000 residents. Jelgava is called
a "student city" because many students from other
cities live there, which makes the real number of
people living in Jelgava much larger. Although
Jelgava is a small city and organising a
high-quality public transportation system should
pose no problem, there are some issues. Jelgava
has one public transport provider, Jelgavas
autobusu parks (source: http://www.jap.lv/), which
operates 20 bus routes in the city. Buses run on a
static schedule that defines when a bus should
depart from each bus stop. The main issue is that
a bus can sometimes be delayed or depart earlier
than scheduled. Citizens waiting at the bus stops
have no information about the actual bus location,
so they need a smarter and more user-friendly
public transportation notification system. Using
real-time GPS data from the buses can help to
solve the above-mentioned problems.
In this paper, the authors compare two prediction
models for predicting bus arrival time using GPS
data recorded at 30-second intervals, which is
considered a limited data set (there was no
information about arrival/departure or dwell
times). Example GPS data for one bus route in
Santiago, Chile, is used (Cortés et al., 2011).
2 MATERIAL AND METHODS
2.1 Data Preparation
The data available for the research were GPS
coordinates (latitude and longitude) of bus stop
locations and historical GPS coordinates of a bus
(and its driven distance) along its route in
Santiago, Chile, recorded every 30 seconds. No
additional information was gathered, such as
delays along the route, arrival/departure times,
passenger counts or dwell times at bus stops.
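As an illustration of how 30-second samples of this kind can be related to stop locations, the sketch below computes the great-circle (haversine) distance between two GPS points, finds the recorded sample closest to a stop, and linearly interpolates the time at which a given driven distance is reached. This is an assumed approach for illustration, not the procedure used in this study. The sample layout `(time_s, lat, lon, driven_m)`, the function names and all coordinate values are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_sample(samples, stop_lat, stop_lon):
    """Return the sample (time_s, lat, lon, driven_m) closest to the stop."""
    return min(samples,
               key=lambda s: haversine_m(s[1], s[2], stop_lat, stop_lon))


def interpolate_time(samples, target_driven_m):
    """Linearly interpolate the time at which the bus has driven
    target_driven_m metres, assuming constant speed between samples."""
    for (t0, _, _, d0), (t1, _, _, d1) in zip(samples, samples[1:]):
        if d0 <= target_driven_m <= d1:
            if d1 == d0:  # bus did not move between the two samples
                return float(t0)
            return t0 + (t1 - t0) * (target_driven_m - d0) / (d1 - d0)
    raise ValueError("target distance lies outside the recorded samples")


# Hypothetical samples, one every 30 s: (time_s, lat, lon, driven_m).
samples = [
    (0, -33.4500, -70.6600, 0.0),
    (30, -33.4510, -70.6600, 111.0),
    (60, -33.4530, -70.6600, 333.0),
]

closest = nearest_sample(samples, -33.4529, -70.6600)
arrival_s = interpolate_time(samples, 222.0)
```

Taking the nearest sample alone resolves the arrival time only to about half the sampling interval. Interpolating on driven distance refines this, but the constant-speed assumption between samples ignores dwell time at stops.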
To make the models more robust, an additional
parameter that could have some impact on arrival
times was considered. One such parameter can be
obtained by observing the route. Therefore,
historical data for the specific route were
examined and approximate delays were assumed based
on intersections, turns, etc. along the route.
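One possible way to encode such an assumed-delay parameter is sketched below: each route element (an intersection or a turn) is given an assumed delay, and the delay accumulated up to a given driven distance is used as an extra model input. The delay values, the element kinds and the function names are hypothetical and are not the values assumed in this study.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical assumed delay per route element, in seconds.
ASSUMED_DELAY_S = {"intersection": 20.0, "turn": 10.0}


def build_delay_profile(route_elements):
    """route_elements: iterable of (distance_m, kind) pairs.

    Returns the element positions sorted along the route and the
    running total of assumed delay after passing each element.
    """
    ordered = sorted(route_elements)
    positions = [distance for distance, _ in ordered]
    cumulative = list(accumulate(ASSUMED_DELAY_S[kind]
                                 for _, kind in ordered))
    return positions, cumulative


def cumulative_delay_s(positions, cumulative, driven_m):
    """Assumed delay accumulated once the bus has driven driven_m metres."""
    # Count the elements located at or before the current position.
    passed = bisect_right(positions, driven_m)
    return cumulative[passed - 1] if passed else 0.0


# Hypothetical route elements: (distance from route start in m, kind).
elements = [(300.0, "intersection"), (120.0, "turn"),
            (800.0, "intersection")]
positions, cumulative = build_delay_profile(elements)
delay_at_500_m = cumulative_delay_s(positions, cumulative, 500.0)
```

The resulting value can be supplied to a regression model alongside driven distance, so that two route segments of equal length but different numbers of intersections are no longer treated identically.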
For model evaluation purposes, the actual arrival
time needs to be known. Data for actual travel time
were also based on manual observations on a map,
since there was no information about the exact arrival
RESIST 2018 - Special Session on Resilient Smart city Transportation