Authors:
Omar Makke (1); Syam Chand (2); Vamsee Batchu (2); Oleg Gusikhin (1) and Vicky Svidenko (1)
Affiliations:
(1) Ford Motor Company, U.S.A.; (2) Ford Motor Company, India
Keyword(s):
Connected Vehicles, Sampling, Big Data, Large Language Model Application, Generative AI.
Abstract:
The impact of connected vehicle big data on the automotive industry is significant. Big data gives data scientists the opportunity to explore and analyze vehicle features and their usage thoroughly, which helps optimize existing designs or offer new features. However, the downside of big data is its associated cost. While storage tends to be cheap, data transmission and computational resources are not. Specifically, for connected vehicle data, even when unstructured data is excluded, the data size can still grow by several terabytes a day if one is not careful about what data to collect. Therefore, it is advisable to apply methods that help avoid collecting redundant data and thereby reduce the computation cost. Furthermore, some data scientists may be tempted to calculate “exact” metrics when the data is available, partly because applying statistical methods can be tedious, and these exact calculations can exhaust the computational resources. In this paper we argue that intelligent sampling systems which centralize the sampling methods and domain knowledge are required for connected vehicle big data. We also present our system, which assists interested parties in performing analytics, and provide two case studies to demonstrate the benefits of the system.