Traffic Flow Prediction from Loop Counter Sensor Data using
Machine Learning Methods
Blaž Kažič, Dunja Mladenić and Aljaž Košmerlj
Artificial Intelligence Laboratory, Jožef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia
Keywords: Time Series, Traffic Flow Prediction, Feature Engineering, Ridge Regression, SVR, Random Forests.
Abstract: Due to increasing demand and growing cities, traffic prediction has been a topic of interest for many researchers over the past few decades. The availability of large amounts of traffic-related data and the emerging field of machine learning have led to a significant shift towards data-driven methods. In this paper, loop counter data are used to develop models that can predict traffic flow for several different prediction intervals into the future. In-depth exploratory data analysis and statistical testing are performed to obtain good-quality informative features. Several feature sets were compared using different machine learning methods: Ridge Regression, SVR and Random Forests. The results show that, in order to obtain good predictions, thorough feature extraction is just as important as, or even more important than, the selection of the learning method. Good features enable us to use less complex methods, which run faster, are more reliable and are easier to maintain. In conclusion, we discuss ideas on how the predictions could be improved even further.
1 INTRODUCTION
Traffic congestion can have substantial effects on quality of life, especially in bigger cities. It is estimated that traffic congestion in the United States causes two billion gallons of fuel to be wasted every year, and that 135 million US drivers spend two billion hours stuck in traffic every year. Altogether, this costs 100 billion USD in the US alone, or about 800 USD per year for an average American driver (Liu et al., 2006). In addition to the economic aspect (wasted money and time), there is also an ecological one: pollution could be reduced significantly by reducing travel time and thus emissions.
The facts mentioned above are the main reasons that governments are investing in Intelligent Transportation Systems (ITS), technologies that should lead to more efficient use of transportation networks. Traffic prediction models have become a main component of most ITS; accurate real-time information and traffic flow prediction are crucial components of such systems. ITS vary in the technologies they apply, from advanced travel information systems, variable message signs and traffic signal control systems to user-friendly applications such as travel advisors. The aim of all of these technologies is the same: to ease traffic flow, reduce traffic congestion and decrease travel time by advising drivers about their routes, time of departure, or even type of transportation (Stathopoulos and Karlaftis, 2003).
The availability of large amounts of traffic-related data, collected from a variety of sources, together with the emerging field of sophisticated machine learning algorithms, has led to a significant leap from analytical modelling to data-driven modelling approaches (Zhang et al., 2011). The main goal of this paper is to investigate different machine learning algorithms and to engineer features that enable us to predict the traffic state several prediction intervals into the future.
2 RELATED WORK
In general, traffic prediction studies can be grouped into three major categories: naïve methods, parametric methods and non-parametric methods (Van Lint and Van Hinsbergen, 2012). Naïve methods are usually simple, model-free baseline predictors, which can sometimes return good results. Parametric models are based on traffic flow theory and are researched separately and in
parallel to non-parametric, more data-driven
machine learning methods.
A strong movement towards non-parametric methods can be observed in recent years, probably because of increased data availability, the
progress of computational power and the
development of more sophisticated algorithms
(Vlahogianni et al., 2014). Non-parametric does not mean that the models have no parameters; rather, it means that the model parameters are flexible and not fixed in advance. The model's structure as well as its parameters are derived from the data. One significant advantage of this approach is that less domain knowledge is required in comparison to parametric methods, but more data is required to determine a model. This also implies that the success of data-driven models is highly correlated with the quality of the available data.
Non-parametric methods can be further
subdivided into two subgroups: classical statistical
regression approaches and data-driven machine
learning approaches (Karlaftis and Vlahogianni,
2011). From the group of statistical methods, the
local linear regression algorithm yields surprisingly
good results, especially on highway data (Rice and
van Zwet, 2004). In contrast, traffic in urban areas
can be much more dynamic and non-linear, mainly
because of the presence of many intersections and
traffic signs. In such environments, data-driven
machine-learning approaches, such as neural
networks (Van Hinsenberg et al., 2007) and SVR
(Vanajakshi and Rilett, 2007), can be more
appropriate, due to their ability to model highly
nonlinear relationships and dynamic processes. In
this research, we test methods from both groups: statistical regression approaches and more complex nonlinear algorithms.
3 DATA
The data used in this research were collected by a single traffic counter sensor installed on the bypass of Ljubljana, the capital city of Slovenia. Measurements are available via a web API service (http://opendata.si/promet/) as a real-time data stream, refreshed every 5 minutes. Our collected database consists of a 6-month record set of sensor data (from January 2014 to July 2014), amounting to 46,447 records.
Every record consists of a timestamp, descriptive information about the sensor (location, region, direction, etc.), and the five measurements used in this research (a minimal polling sketch follows the list):
Flow: the number of vehicles passing a certain reference point, per hour.
Gap: the average time gap (in seconds) between vehicles, per hour.
Occupancy: occupancy of the road in units of 1/10th of one percent, e.g. the value 57 in the data converts to 5.7%.
Speed: the average speed (km/h) of vehicles, per hour. The speed of every vehicle is almost impossible to track and is, therefore, estimated from other parameters.
Traffic status: a numeric status of the traffic, 1 being "normal traffic" and 5 being "heavy traffic with congestion".
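As an illustration, the following sketch polls such a stream in Python. The payload structure and field names (records, trafficStatus, etc.) are assumptions for illustration only; the actual API schema may differ.

```python
import time
import requests

API_URL = "http://opendata.si/promet/"  # real endpoint; query parameters, if any, are not shown

def poll_sensor(url: str = API_URL, interval_s: int = 300):
    """Poll the traffic API every 5 minutes and yield raw records."""
    while True:
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        payload = resp.json()  # assumes a JSON response
        # Hypothetical field names; the actual schema may differ.
        for rec in payload.get("records", []):
            yield {
                "timestamp": rec.get("timestamp"),
                "flow": rec.get("flow"),
                "gap": rec.get("gap"),
                "occupancy": rec.get("occupancy"),
                "speed": rec.get("speed"),
                "traffic_status": rec.get("trafficStatus"),
            }
        time.sleep(interval_s)
```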
Figure 1: Traffic loop sensor data for randomly selected week.
VEHITS2015-InternationalConferenceonVehicleTechnologyandIntelligentTransportSystems
120
3.1 Data Preprocessing
Upon the first audit of the collected measurements, we determined that many records were duplicated across the 5-minute intervals. Since the timestamp was duplicated as well, we can assume these are actually missing values, probably due to service downtime.
Further sanity checking of the data made it clear that several entries also contained clearly wrong or invalid values. For example, we know that the "TrafficStatus" parameter can only take values between 1 and 5, but in our data some values were 0. After checking these records, we determined that their other measurements had been corrupted as well; therefore, all such records were marked as invalid samples.
Missing data and records flagged as invalid were handled with a partial listwise deletion approach. There are two reasons for this choice. First, since the gaps in our missing data can be very long (from a couple of hours to a couple of days), it is not trivial to replace the missing values, and we could introduce too much uncertainty by imputing them. Second, although cleaning the data with the partial deletion method reduced the number of records in the data set by 37%, we are still left with a reasonably large amount of data (29,215 samples) to train and test our prediction models, on a much cleaner and more representative data set.
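A minimal sketch of this cleaning step, assuming the records are loaded into a pandas DataFrame with "timestamp" and "TrafficStatus" columns:

```python
import pandas as pd

def clean_records(df: pd.DataFrame) -> pd.DataFrame:
    """Partial listwise deletion of duplicated and invalid records (a sketch)."""
    # Records repeated with an identical timestamp are treated as missing
    # values caused by service downtime and are removed entirely.
    duplicated = df["timestamp"].duplicated(keep=False)
    df = df[~duplicated]
    # TrafficStatus is only valid between 1 and 5; out-of-range values mark
    # corrupted records, which are dropped together with their measurements.
    valid = df["TrafficStatus"].between(1, 5)
    return df[valid].reset_index(drop=True)
```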
3.2 Exploratory Data Analysis
By visually exploring the data of one randomly selected week from our dataset (Figure 1), we can distinguish daily patterns over the week. The difference between traffic by day and traffic by night is clearly seen. Another interesting observation is that the working days have highly similar patterns, while the patterns for the weekend are different. A distinct peak in the morning (morning rush hour), followed by another peak (afternoon rush hour), can also be observed.
Another interesting finding that can be observed in Figure 1 is the anomaly on Wednesday during the afternoon rush hour. We can observe a drop in the flow and speed parameters, meaning that speed and traffic flow were unusually low for that time of day. Furthermore, we can see that occupancy was very high at that time and that the traffic status changed from 4 to 5. This is most probably due to traffic congestion (a traffic jam or an accident). Such anomalous events are also the most valuable to predict.
3.2.1 Traffic Flow during the Week
In the previous section, it was determined that traffic during the weekend is significantly different from traffic on weekdays, which is intuitively true. This implies that it would be useful to include this information (whether it is a weekend or a weekday) as a new feature when training our model. What about traffic across the days of the week? Would it also be useful to have the day of the week as a feature?
To answer this question, we performed a statistical test with the null hypothesis that traffic flow is the same across weekdays. Since the traffic flow parameter is not normally distributed, we used the Mann-Whitney U-test instead of the standard t-test. The p-value, which determines the statistical significance of a hypothesis, was computed between all combinations of days in the week. A small p-value (smaller than the critical value, usually 0.05) means that we can reject the null hypothesis; otherwise we cannot reject it.
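A sketch of this pairwise testing with SciPy, assuming a pandas DataFrame with "Flow" and "DayOfWeek" columns:

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

def pairwise_flow_tests(df):
    """Mann-Whitney U-test on traffic flow between every pair of weekdays.

    Assumes a DataFrame with a 'Flow' column and a 'DayOfWeek' column
    (0 = Monday ... 6 = Sunday).
    """
    results = {}
    for d1, d2 in combinations(range(7), 2):
        x = df.loc[df["DayOfWeek"] == d1, "Flow"]
        y = df.loc[df["DayOfWeek"] == d2, "Flow"]
        _, p = mannwhitneyu(x, y, alternative="two-sided")
        results[(d1, d2)] = p  # p > 0.05: cannot reject equal flow distributions
    return results
```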
The results are presented in the form of a heat map in Figure 2. We can see that the only combinations for which we cannot reject our null hypothesis (p-value > 0.05) are the days from Tuesday to Thursday. For all other day pairs we can reject the null hypothesis, meaning that traffic on these days differs significantly. According to these results, it makes sense to also include the day of the week as a feature in our model.
Figure 2: P-values from the Mann-Whitney test.
TrafficFlowPredictionfromLoopCounterSensorDatausingMachineLearningMethods
121
3.2.2 Average Traffic Flow
Another assumption, derived from the data analysis, is that the traffic state on a certain day and time is similar to the traffic one week earlier at the same time. From the historical data, we calculated the average of our target variable (i.e. flow) per day and time over one week. Figure 3 shows the result and a comparison between this average and one randomly selected week of data. It can clearly be seen that the average traffic is already a very good fit. In fact, this simple naïve predictor is still widely used in various practical applications, such as routing or travel time estimation (Van Lint and Van Hinsbergen, 2012). We will use this information as a feature when training models and as a baseline predictor when comparing different methods.
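A sketch of how such a weekly profile can be computed, assuming a DataFrame indexed by timestamp; for simplicity it averages per hour rather than per 5-minute slot:

```python
import pandas as pd

def weekly_average_flow(df: pd.DataFrame) -> pd.Series:
    """Average flow per (day-of-week, hour) slot; the naive baseline predictor."""
    profile = df.groupby([df.index.dayofweek, df.index.hour])["Flow"].mean()
    profile.index.names = ["DayOfWeek", "HourOfDay"]
    return profile

# Lookup for a given future time t: profile.loc[(t.dayofweek, t.hour)]
```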
Figure 3: Traffic flow average for one week.
3.3 Feature Engineering
It is well known that feature engineering is key to success in applied machine learning. Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models: the better the features, the better the results. The best way to extract good representative features is by exploring and understanding the data, which is what we did in the previous section.
We have already discovered that there is a relationship between the datetime and the other attributes. Since it can be difficult for a model to take advantage of the datetime in its native form (2014-01-01 01:00:00), we decompose it into constituent parts that may be more informative for the model. We introduce four new features:
HourOfDay [0 - 23]
DayOfWeek [0 - 6]
Month [1 - 12]
Weekday/Weekend [0 - 1]
From previous empirical experience, it can also be beneficial for some models to convert these ordinal variables into dummy/indicator (categorical) variables (e.g. a "DayOfWeek" value of 3 is transformed into 0001000).
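A sketch of both steps with pandas, assuming a DataFrame indexed by timestamp (the feature names follow the list above):

```python
import pandas as pd

def add_datetime_features(df: pd.DataFrame, dummies: bool = False) -> pd.DataFrame:
    """Decompose the timestamp index into the four datetime features."""
    out = df.copy()
    out["HourOfDay"] = out.index.hour        # 0-23
    out["DayOfWeek"] = out.index.dayofweek   # 0-6 (Monday = 0)
    out["Month"] = out.index.month           # 1-12 (pandas convention)
    out["Weekend"] = (out.index.dayofweek >= 5).astype(int)  # 0/1
    if dummies:
        # One-hot encode the ordinal features, e.g. DayOfWeek 3 -> 0001000.
        out = pd.get_dummies(out, columns=["HourOfDay", "DayOfWeek", "Month"])
    return out
```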
Another important piece of information, derived from the previous section, is that traffic is very similar from week to week. Therefore, the traffic average (according to time and day of the week) can represent normal traffic surprisingly well. This information can be used as a feature that indicates the average traffic state at a specific time in the future. For this purpose, we create one extra feature:
AvrFlow
Although good feature engineering is very important for achieving high-quality predictions, the actual success is a combination of the chosen model, the data, and the extracted features. In order to find the best combination, four different datasets were created; Table 1 shows which features each of them contains.
Table 1: Features and datasets used in further analysis (○ indicates dummy-encoded features): (1) only_measurements, (2) with_datetime, (3) dummy_datetime, (4) with_avr_dummy.

Feature         (1) (2) (3) (4)
Flow             ●   ●   ●   ●
Gap              ●   ●   ●   ●
Occupancy        ●   ●   ●   ●
Speed            ●   ●   ●   ●
TrafficStatus    ●   ●   ●   ●
HourOfDay            ●   ○   ○
DayOfWeek            ●   ○   ○
Month                ●   ○   ○
Weekday              ●   ○   ○
AvrFlow                      ●
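For reference, the four feature sets can be summarised as simple Python lists; the names below mirror Table 1:

```python
MEASUREMENTS = ["Flow", "Gap", "Occupancy", "Speed", "TrafficStatus"]
DATETIME = ["HourOfDay", "DayOfWeek", "Month", "Weekend"]

FEATURE_SETS = {
    "only_measurements": MEASUREMENTS,
    "with_datetime": MEASUREMENTS + DATETIME,    # ordinal datetime features
    "dummy_datetime": MEASUREMENTS + DATETIME,   # same features, one-hot encoded
    "with_avr_dummy": MEASUREMENTS + DATETIME + ["AvrFlow"],
}
```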
4 METHODS OF PREDICTION
In this research, we tested three different data-driven methods from the well-known machine learning library scikit-learn (version 0.15) (Pedregosa et al., 2011); they are instantiated in the sketch that follows the list:
Ridge Regression: a computationally undemanding and fast method; it can perform surprisingly well with proper feature engineering.
SVR: Support Vector Regression is a sophisticated non-linear machine-learning method, but it can be computationally intense and slow.
Random Forests: an ensemble method that operates by constructing a multitude of decision trees. It usually needs some parameter tuning to avoid over-fitting to the training data.
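A sketch of how the three methods can be instantiated with scikit-learn (default parameters, as in the first comparison; the import paths follow current scikit-learn rather than version 0.15):

```python
from sklearn.linear_model import Ridge
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

# Default-parameter models, as used in the first round of comparisons.
MODELS = {
    "Ridge Regression": Ridge(),
    "SVR": SVR(),
    "Random Forests": RandomForestRegressor(),
}
```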
In order to train models for comparison, we also need a target variable whose future values we want to predict. Since traffic flow is the most informative attribute, we used it as the target attribute in this research. We also need to specify the prediction interval; for most of this research, a 5-hour interval was used. Target values were obtained by copying the target attribute's time series and shifting it according to the selected prediction interval (in this case, 5 hours into the future). Predicting any other attribute from the database, or changing the prediction interval, is therefore a trivial change. At the end of this paper (Results section), performance with other prediction intervals is tested as well.
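A sketch of this target construction, assuming 5-minute records in a pandas DataFrame:

```python
def make_target(df, horizon_steps: int):
    """Create the prediction target by shifting the flow series backwards.

    With 5-minute records, a 5-hour horizon corresponds to 60 steps.
    Rows at the end without a future value are dropped.
    """
    target = df["Flow"].shift(-horizon_steps)
    mask = target.notna()
    return df[mask], target[mask]

# X, y = make_target(df, horizon_steps=60)  # predict flow 5 hours ahead
```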
4.1 Method Comparisons
The performance of the above-described methods with default parameter values was measured over a range of different testing datasets (presented in the Feature Engineering section) using a measure of fitness: the coefficient of determination, denoted R². This is a standard measure of accuracy for regression problems. The score is at most 1, which implies a perfect fit, and can fall below zero when a model performs worse than predicting the mean (Draper and Smith, 1998). When performing comparison tests, cross-validation was used (shuffled splits, with 3 iterations and a test size of 20% of the given data set). All tests also include a baseline predictor used for comparison (the average weekly traffic flow presented in the Average Traffic Flow section).
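A sketch of this evaluation protocol; it uses the current scikit-learn API (sklearn.model_selection) rather than the 0.15 interface used in the paper:

```python
from sklearn.model_selection import ShuffleSplit
from sklearn.metrics import r2_score

def evaluate(model, X, y, n_splits=3, test_size=0.2, seed=0):
    """Average R^2 over shuffled train/test splits (3 iterations, 20% test).

    Assumes X is a feature DataFrame and y the target Series.
    """
    splitter = ShuffleSplit(n_splits=n_splits, test_size=test_size,
                            random_state=seed)
    scores = []
    for train_idx, test_idx in splitter.split(X):
        model.fit(X.iloc[train_idx], y.iloc[train_idx])
        scores.append(r2_score(y.iloc[test_idx],
                               model.predict(X.iloc[test_idx])))
    return sum(scores) / len(scores)
```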
Comparing models over the different datasets shows how feature enrichment consistently improves prediction scores for all methods (Figure 5). The first major improvement in prediction performance can be observed when using the dataset with additional features derived from the datetime information (the dataset called with_datetime). Compared to the results where only traffic measurements are used (only_measurements), the R² score approximately triples for all methods. This shows the significance of the datetime information, which is intuitive; it would be absurd not to use it when dealing with time series data. The importance of datetime-related features can also be observed by performing a feature importance test (Figure 4), in which the "HourOfDay" feature is rated as the most important, followed by the "DayOfWeek" and "Weekend" features, also extracted from the datetime information.
Figure 4: Feature importances for different data sets. We can observe how additional features change the importances.

Figure 5: Methods and feature sets comparisons.
Another interesting improvement can be observed with the Ridge Regression method when using dummy datetime features (dummy_datetime). Based on previous experience, this improvement was expected, but it is interesting to observe how drastically the performance of the SVR model decreased in comparison to the SVR model trained on the dataset in which the datetime features were not dummy-encoded (with_datetime). Losing the order information of the datetime variables by splitting them into separate independent variables effectively nullifies their usefulness to SVR, as the
TrafficFlowPredictionfromLoopCounterSensorDatausingMachineLearningMethods
123
score drops back down to the same value as without the datetime information. Even when "AvrFlow" is available, the dummy features decrease the performance in comparison to the baseline.
However, it is interesting that none of the results mentioned above outperformed the baseline predictor. In short, the reason for such a good result of the baseline method is the choice of a fairly long prediction horizon, i.e. 5 hours into the future. This is explained in greater detail in the Results section.
The most important result, however, is that by using the dataset that includes the localized average flow as a feature (feature set with_avr_dummy), the Random Forests and Ridge Regression methods were able to outperform the baseline method. The dominant importance of the new "AvrFlow" feature can also be observed in Figure 4, which shows how the feature importances change across the different datasets; the "AvrFlow" feature is clearly by far the most important one when predicting 5 hours into the future.
4.2 Learning Curves
By visualizing learning curves, we can take a better look at the developed models and diagnose whether a model is performing well or can still be improved. Figure 6 shows that the SVR models need many more examples to attain a good prediction score, while the Ridge Regression and Random Forests methods outperform SVR even when much less data is available. Judging by the slope of the curves, it also does not seem that the scores of these two methods would improve with more samples.

Figure 6: Learning curves for several methods and different feature sets.
The fact that the training and cross-validation scores are almost the same can indicate that the model suffers from high bias (is under-fitted). This situation can be observed for the Ridge Regression performance with the "only_measurements" dataset in Figure 6. Usually, this happens when we have an abundance of data but too few features, and one standard way to improve such a model is to add more features. Indeed, we can observe that the Ridge Regression prediction score increased significantly when additional features were added.
Looking at the Random Forests learning curves, we can observe a gap between the training and cross-validation scores, which might indicate that we are dealing with high variance (over-fitting). In such cases, the model might be improved by obtaining more training examples, by trying smaller sets of features, or by decreasing its complexity. Since we do not have more training examples, we attempt to improve our model along the two other suggestions, by tuning the model parameters.
Parameter optimisation was done by performing a grid search over several different options for the following parameters: the number of trees (n_estimators), the number of features (max_features), and the minimum number of samples in newly created leaves (min_samples_leaf). The best parameter set was found with 100 trees and a minimum of 10 samples per leaf. With this optimisation, we significantly reduced the size of the decision trees (from 26,013 to 2,207 nodes) and decreased complexity, which makes the model simpler and faster. It also resulted in a slightly better prediction score (Occam's razor).
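A sketch of such a grid search with scikit-learn; the candidate values in the grid are illustrative assumptions, not the exact grid used in the paper:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Candidate values are assumptions; the paper's best setting was
# n_estimators=100 with min_samples_leaf=10.
param_grid = {
    "n_estimators": [10, 50, 100, 200],
    "max_features": ["sqrt", 0.5, 1.0],
    "min_samples_leaf": [1, 5, 10, 20],
}
search = GridSearchCV(RandomForestRegressor(random_state=0),
                      param_grid, scoring="r2", cv=3)
# search.fit(X_train, y_train); search.best_params_
```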
Additionally, the well-known AdaBoost method was also tested, to determine whether it could significantly improve the score of the Random Forests model. Indeed, the score improved further, from 0.93 to 0.94. Since this add-on also significantly slowed down the model, we decided not to use it, but the result indicates that the score could possibly be improved even further.
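A sketch of this boosting add-on; the keyword "estimator" is the current scikit-learn spelling (older releases used "base_estimator"), and the parameter values are assumptions:

```python
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor

# Boosting the tuned forest; noticeably slower than the plain forest.
boosted = AdaBoostRegressor(
    estimator=RandomForestRegressor(n_estimators=100, min_samples_leaf=10),
    n_estimators=10,
)
# boosted.fit(X_train, y_train)
```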
5 RESULTS
The last 20% of the original data set was reserved for evaluation purposes (15 May–30 Jun). What we can clearly see from the results (Table 2) is the general trend of additional features consistently improving prediction performance across all tested methods.
Table 2: R² score results for different models and datasets: (1) only_measurements, (2) dummy_datetime, (3) with_avr_dummy.

Method              (1)    (2)   (3)
Historical Average   0.91   0.91  0.91
Ridge Regression    -0.08   0.80  0.92
Random Forests       0.30   0.88  0.91
SVR                  0.27   0.27  0.84
The results show that all models performed best when using the with_avr_dummy feature set. On this testing dataset, Ridge Regression was the best method (0.92), followed by Random Forests (0.91) and the baseline average flow (0.91). Unlike in the previous cross-validated test, where Random Forests was the best prediction model, Ridge Regression now performed better. This might indicate that the Random Forests model is indeed slightly over-fitted. It is, however, again interesting to observe how well the baseline method performed in comparison to the other machine learning methods.
Figure 7: Visualizing predictions from different models with different feature sets.
TrafficFlowPredictionfromLoopCounterSensorDatausingMachineLearningMethods
125
Figure 8: Error (MAE) comparison for different prediction horizons.
With additional analysis, we found that this is due to the relatively long prediction horizon that we chose (5 hours into the future). Figure 8 shows the prediction error (Mean Absolute Error, MAE) for three different models over several prediction horizons (from 5 minutes to 12 hours into the future): two naïve predictors and a regression model. The first naïve predictor is the current status predictor, which takes only the current measurements into account and assumes that traffic will remain constant. The second is the localized average traffic status, which was used as the baseline in the previous tests. The third model is Ridge Regression, since it was found to be the best predictor in comparison to the other methods in this section.
From Figure 8 we can observe that the current status measurements work better than the average status for short-term predictions (prediction horizons below 1 hour), while the average status works better for long-term predictions. This is intuitively true, since the current status has less influence in the long term. The results also illustrate how the prediction error increases with larger prediction horizons, which is expected as well, since larger forecast intervals correspond to larger uncertainty.
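A sketch of how such an error-versus-horizon comparison for the two naïve predictors can be computed, reusing the weekly profile from the earlier sketch:

```python
from sklearn.metrics import mean_absolute_error

def mae_by_horizon(df, avr_flow, horizons_steps):
    """MAE of the two naive predictors for several horizons (in 5-min steps)."""
    results = {}
    for h in horizons_steps:
        future = df["Flow"].shift(-h).dropna()          # true flow h steps ahead
        current = df["Flow"].loc[future.index]          # "traffic stays constant"
        avg = [avr_flow.loc[(t.dayofweek, t.hour)] for t in future.index]
        results[h] = {
            "current_status": mean_absolute_error(future, current),
            "weekly_average": mean_absolute_error(future, avg),
        }
    return results

# e.g. horizons from 5 minutes (1 step) to 12 hours (144 steps):
# mae_by_horizon(df, weekly_average_flow(df), [1, 6, 12, 60, 144])
```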
However, the most significant point of this figure is that Ridge Regression outperforms both naïve methods for all prediction horizons. This is because this method indirectly includes both models' outputs (the current status and the average status) as inputs and learns how to weight them for a specific prediction horizon; therefore, it makes more accurate predictions.

Figure 9: Traffic data with prediction errors.
By analysing the predictions with the largest errors (Figure 9), we can conclude that most of them are caused by anomalous traffic patterns, such as holidays, or very low or high traffic (probably due to traffic accidents or possibly bad weather). Since the average traffic is the most important feature in our model (due to the long prediction interval of 5 hours), it is to be expected that the largest errors occur on such anomalies. By using additional data sources that describe anomalous traffic, we could probably increase our prediction accuracy. Examples of such additional data sources would be information about holidays, weather forecasts, traffic status reports, etc.
6 CONCLUSIONS
In this research, we compared the performance of three machine learning methods (Ridge Regression, SVR, and Random Forests) for predicting traffic flow. Feature engineering was also described and recognized as a key component for good results.
The results show that simple naïve methods, such as the historical average, are surprisingly effective for long-term predictions (more than one hour into the future), while using the current traffic measurements as a naïve predictor works well for short-term predictions (less than 1 hour). This is to be expected, since the current traffic situation has more effect on traffic in the near future than on traffic a few hours or days ahead. More importantly, the results show that machine learning methods which include both the historical averages and the current values produce better predictions than either of the naïve predictors, for all prediction horizons (short and long term).
VEHITS2015-InternationalConferenceonVehicleTechnologyandIntelligentTransportSystems
126
Also noteworthy is the fact that, with high-quality features, simple methods such as linear regression can work as well as or even better than more sophisticated algorithms (such as Random Forests or SVR). With less complex models, optimal model parameters are found more readily, the models run a lot faster, and they are easier to understand and maintain.
The main disadvantage of the models presented in this research is their inability to predict unusual traffic events. Even though the common traffic status is informative for a commuter in a new environment, unusual traffic is the most valuable information for a local commuter who is already aware of the usual traffic. The main reason for this disadvantage is that the current models use only historical traffic data. Since some unusual traffic events are caused by other related events (such as nearby traffic accidents, bad weather, or holidays), we believe that including additional data sources in the model could significantly improve the prediction of such events.
Therefore, our future plan is to collect several high-quality traffic-related data sources (such as weather forecasts, traffic alerts, special-day statuses, bigger social events, etc.) and fuse them with the loop counter data in order to build better traffic prediction models. We intend to test different data fusion approaches: early (or full) integration, which transforms the data sources into a single feature-based table; late (or decision) integration, where each data source gives rise to a separate model and the predictions are fused afterwards; and intermediate (or partial) integration, where the data are fused through inference of a single joint model with recent matrix-factorization-based algorithms, which have provided very good results in the field of bioinformatics (Žitnik and Zupan, 2013).
ACKNOWLEDGEMENTS
This work was supported by Slovenian Research
Agency and the ICT Programme of the EC under
MobiS (FP7-318452).
REFERENCES
Draper, N.R. and Smith, H., 1998. Applied regression analysis, 3rd edition. New York: Wiley.
Van Hinsenberg, C.P.I., Lint, J.W.C. Van and Sanders,
F.M., 2007. Short Term Traffic Prediction Models.
Proc., 14th World Congress on Intelligent Transport
Systems: ITS for a Better Life.
Karlaftis, M.G. and Vlahogianni, E.I., 2011. Statistical
methods versus neural networks in transportation
research: Differences, similarities and some insights.
Transportation Research Part C: Emerging
Technologies, 19(3), pp.387–399.
Van Lint, J.W.C. and Van Hinsbergen, C.P.I.., 2012.
Short-Term Traffic and Travel Time Prediction
Models, in Artificial Intelligence Applications to
Critical Transportation Issues. Transportation
Research Circular, E-C168 (November).
Liu, Y. et al., 2006. A scalable distributed stream mining
system for highway traffic data. Knowledge Discovery
in …, pp.309–321.
Pedregosa, F. et al., 2011. Scikit-learn: Machine Learning
in Python. Journal of Machine Learning Research, 12,
pp.2825–2830.
Rice, J. and van Zwet, E., 2004. A simple and effective
method for predicting travel times on freeways.
Intelligent Transportation Systems, IEEE
Transactions, pp.200–207.
Stathopoulos, A. and Karlaftis, M.G., 2003. A multivariate
state space approach for urban traffic flow modeling
and prediction. Transportation Research Part C:
Emerging Technologies, 11(2), pp.121–135.
Vanajakshi, L. and Rilett, L.R., 2007. Support Vector
Machine Technique for the Short Term Prediction of
Travel Time. 2007 IEEE Intelligent Vehicles
Symposium, pp.600–605.
Vlahogianni, E.I., Karlaftis, M.G. and Golias, J.C., 2014.
Short-term traffic forecasting: Where we are and
where we’re going. Transportation Research Part C:
Emerging Technologies.
Zhang, J. et al., 2011. Data-Driven Intelligent
Transportation Systems: A Survey. IEEE Transactions
on Intelligent Transportation Systems, 12(4), pp.1624–
1639.
Žitnik, M. and Zupan, B., 2013. Data Fusion by Matrix
Factorization. IEEE Transactions on Pattern Analysis
and Machine Intelligence, pp.13.
TrafficFlowPredictionfromLoopCounterSensorDatausingMachineLearningMethods
127