ACC_NORM_VAR and GPS_SPD_MED), whereas the
other two models, M1 and M3, differ: M1 because it
includes ACC_STD_V, which carries the same
information as ACC_NORM_VAR, and M3 because it
does not have access to ACC_NORM_VAR.
4 CONCLUSIONS
For a given classification (or regression) problem,
the number of possible combinations of sensors,
features, classifiers and hyper-parameters makes the
search for an optimal classifier very time-consuming.
It is therefore worthwhile to first simplify the
problem with quick data mining tools.
In this study, we present three simple data mining
tools: Principal Component Analysis, Mahalanobis
distance and Linear Discriminant Analysis.
We apply them to real data from the transportation
mode classification problem and show that we are
able to:
- clean the data: we remove outliers representing
  11% of the samples;
- simplify the problem: we reduce the data dimension
  from 14 to 8, and this simplification even improves
  the classifier performance;
- study the importance of each of the 8 features: it
  turns out that the feature ‘ACC_NORM_VAR’ is very
  important, whereas ‘MAG_NORM_STD’ can be removed
  with a small effect on performance (-0.01).
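
As an illustration of the first two steps, the following is a minimal sketch and not the implementation used in this study. It assumes NumPy, synthetic Gaussian data standing in for the 14 features, and a quantile threshold picked only to mirror the 11% figure above; the function names (mahalanobis_distances, remove_outliers, pca_explained_variance) are hypothetical. The actual threshold and the procedure that led to the 8 retained features are not restated in this section.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X to the sample mean."""
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # The pseudo-inverse tolerates (nearly) collinear features.
    inv_cov = np.linalg.pinv(cov)
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    return np.sqrt(np.maximum(d2, 0.0))


def remove_outliers(X, quantile=0.89):
    """Keep the samples whose distance is below the given quantile.

    quantile=0.89 is an assumption: it discards roughly 11% of the rows.
    """
    d = mahalanobis_distances(X)
    keep = d <= np.quantile(d, quantile)
    return X[keep], keep


def pca_explained_variance(X):
    """Fraction of total variance carried by each principal component."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Synthetic stand-in for the real feature matrix (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 14))
X[:20] += 15.0  # inject a few obvious outliers

X_clean, keep = remove_outliers(X, quantile=0.89)
explained = pca_explained_variance(X_clean)
print(X_clean.shape, np.round(explained.cumsum(), 2))
```

The cumulative explained variance printed at the end is one possible way to judge how many dimensions carry most of the information before choosing which features to keep.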
ACKNOWLEDGEMENTS
This work is part of the BONVOYAGE project which
has received funding from the European Union’s
Horizon 2020 research and innovation programme
under grant agreement No 635867.