assignment is also either the same or higher than the
true positives obtained using the one-versus-all and
the round-robin assignments. The bracket assignment
was introduced to avoid some of the drawbacks of the
one-versus-all assignment for this application. The
one-versus-all assignment inherently requires a large
number of records for the unknown vehicles, because
these records are exposed to all the clusters of all the
vocations at once. The bracket assignment overcomes
this limitation by comparing two vocations at a time,
and it was shown in this study to perform comparably
to the one-versus-all assignment. The bracket assignment
was also compared to a round-robin assignment, which
scales with an increasing number of vocations. The
results show that the bracket assignment has higher
precision and recall than the round-robin assignment
and, most importantly, lower time complexity.
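The difference in the number of pairwise comparisons can be illustrated with a minimal sketch. It assumes that the bracket assignment behaves like a single-elimination tournament and that the round-robin assignment compares every pair of vocations; neither detail is confirmed by this section. The `compare` function, the vocation labels, and the scores are hypothetical placeholders for whatever pairwise cluster comparison is actually used.

```python
from itertools import combinations

def round_robin_winner(vocations, compare):
    """Compare every pair of vocations; the one with the most wins is selected."""
    wins = {v: 0 for v in vocations}
    n_comparisons = 0
    for a, b in combinations(vocations, 2):
        wins[compare(a, b)] += 1
        n_comparisons += 1
    best = max(vocations, key=lambda v: wins[v])
    return best, n_comparisons

def bracket_winner(vocations, compare):
    """Single-elimination bracket (assumed): winners advance round by round."""
    remaining = list(vocations)
    n_comparisons = 0
    while len(remaining) > 1:
        next_round = []
        # Pair up neighbours; the winner of each pair advances.
        for i in range(0, len(remaining) - 1, 2):
            next_round.append(compare(remaining[i], remaining[i + 1]))
            n_comparisons += 1
        # With an odd count, the last vocation advances without a comparison.
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])
        remaining = next_round
    return remaining[0], n_comparisons

# Hypothetical similarity scores between an unknown vehicle and each vocation.
scores = {"bus": 0.2, "refuse": 0.9, "delivery": 0.5, "tractor": 0.4, "utility": 0.1}
vocations = list(scores)

def compare(a, b):
    """Hypothetical pairwise comparison: the higher-scoring vocation wins."""
    return a if scores[a] >= scores[b] else b

rr_best, rr_count = round_robin_winner(vocations, compare)
br_best, br_count = bracket_winner(vocations, compare)
```

Under these assumptions, k vocations need k(k-1)/2 comparisons in the round-robin scheme and k-1 in the bracket, so the gap widens as vocations are added.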
Several directions are being considered for future
work, including reducing vocation confounding by
applying weights to specific features. In addition, the
proposed vocation identification algorithm relies on
features aggregated daily from the duty cycle of the
vehicle over a period of 13 days. Using data points
collected over shorter sample periods would enhance the
applicability of the algorithm to a wider range of vehicles.
ACKNOWLEDGEMENTS
This research was supported in part by Allison
Transmission, Inc.
VEHITS 2021 - 7th International Conference on Vehicle Technology and Intelligent Transport Systems