Authors: Karel Macek 1; Nicholas Čapek 1 and Nikola Pajerová 2
Affiliations: 1 AI Center of Excellence, Generali Česká pojišťovna, Na Pankráci 1720, Prague, Czechia; 2 Department of Technical Mathematics, Faculty of Mechanical Engineering, CTU, Resslova 307, Prague, Czechia
Keyword(s):
Machine Learning, Classification, Regression, Random Sample, Vectorization, Image Similarity, Hip Bone, 3D Scans.
Abstract:
Machine learning has been applied to a variety of inputs, including multimedia and graphs. Some practical applications motivate the use of unordered sets that are regarded as samples from a probability distribution. These sets may be large and are not fixed in length. Standard sequence models do not seem appropriate because the order of the elements plays no role. The present work examines four alternative transformations of these inputs into fixed-length vectors and demonstrates the approach in two case studies. In the first, pairs of scans were classified as coming from the same document based on the distribution of lengths between the reference points. In the second, a person's age was predicted from the distribution of D1 characteristics of the 3D scan of their hip bones.
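To make the idea of mapping an unordered, variable-length sample to a fixed-length vector concrete, the following is a minimal sketch. The abstract does not name the four transformations examined in the paper, so the three shown here (quantiles, normalized histogram, summary moments) are illustrative assumptions, and all function names and parameters are hypothetical.

```python
import math
import statistics


def quantile_vector(sample, k=5):
    """Map a sample of any length to k evenly spaced quantiles (k >= 2).

    Hypothetical helper: uses linear interpolation between order statistics.
    """
    xs = sorted(float(x) for x in sample)
    if not xs or k < 2:
        raise ValueError("need a non-empty sample and k >= 2")
    out = []
    for i in range(k):
        pos = (len(xs) - 1) * i / (k - 1)   # fractional index into sorted data
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(xs) - 1)
        frac = pos - lo
        out.append(xs[lo] * (1 - frac) + xs[hi] * frac)
    return out


def histogram_vector(sample, bins=5, lo=0.0, hi=1.0):
    """Map a sample to relative bin frequencies over a fixed range [lo, hi].

    Values outside the range are ignored (an assumption of this sketch).
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in sample:
        if lo <= x <= hi:
            idx = min(int((x - lo) / width), bins - 1)  # hi falls in last bin
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def moment_vector(sample):
    """Map a sample to [mean, population std, min, max]."""
    xs = [float(x) for x in sample]
    return [statistics.fmean(xs), statistics.pstdev(xs), min(xs), max(xs)]


# Two samples of different length yield vectors of the same length,
# and reordering a sample does not change its vector.
short_sample = [0.1, 0.4, 0.4, 0.9]
long_sample = [0.9, 0.1, 0.4, 0.4, 0.2, 0.7]
features_short = quantile_vector(short_sample) + histogram_vector(short_sample)
features_long = quantile_vector(long_sample) + histogram_vector(long_sample)
```

Each function depends only on the multiset of values, not on their order, which is the property the abstract asks for; the resulting vectors could then be passed to any standard classifier or regressor.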