We find it remarkable that both CL and SL exhibit such impressive subitizing error performance in baseline test scenarios, even at super-human ranges. We did not investigate whether a practical upper bound to this range exists nor, more generally, how error performance varies as a function of the dataset range; both questions would of course be of interest.
The results shown indicate that CL and SL perform quite similarly in shape generalization tests: both suffer significant error probability degradation but remain robust with respect to the conditional distance distribution. This might be expected, since the conservation of natural numbers principle guiding CL should not grant it any advantage when dealing with varying shapes.
In contrast, CL alone maintains this robustness in quantity generalization tests as well, and in particular in range extension tests, where the scheme is asked to estimate at a glance a quantity larger than any it was ever exposed to. This seems to support our motivating conjecture that CL, because it is guided by the principle of conservation of natural numbers, should acquire after training a deeper and more grounded sense of ‘numberness’. This would then be the second instance in a row in which CL shows a fundamental superiority over SL in generating information-rich representations, the first being the aforementioned capability of detecting object attributes (Nissani (Nissensohn), 2023).
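For context, CL here denotes contrastive learning schemes of the kind typically trained with an InfoNCE-style objective (Oord et al., 2018). The sketch below is a generic, minimal illustration of such a loss, not the specific conservation-guided scheme evaluated in this work; the function name, tensor shapes, and temperature value are our own illustrative assumptions.

```python
# Minimal InfoNCE-style contrastive loss (Oord et al., 2018), in PyTorch.
# Illustrative sketch only; names, shapes, and temperature are assumptions.
import torch
import torch.nn.functional as F

def info_nce_loss(z_a, z_b, temperature=0.1):
    """z_a, z_b: (batch, dim) embeddings of two views of the same scenes."""
    z_a = F.normalize(z_a, dim=1)          # map embeddings to the unit hypersphere
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature   # (batch, batch) cosine similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # diagonal pairs are positives; all other batch entries act as negatives
    return F.cross_entropy(logits, targets)
```

Training pulls matched views together on the hypersphere while pushing apart all other batch members, the mechanism through which representation structure emerges in CL (Wang, Isola, 2020).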
In this introductory work we have devised and employed a toy problem to demonstrate our ideas, because it allows better control of test bench variables and parameters. It will of course be important to evaluate whether similar results can be achieved in more realistic scenarios, which may include clutter, occlusion, and composite scenes with different classes of real-life objects. Several datasets for this purpose are already available (Acharya et al., 2019).
REFERENCES
Acharya, M., Kafle, K., Kanan, C., ‘TallyQA: Answering
Complex Counting Questions’, AAAI 2019
Chattopadhyay, P., Vedantam, R., Selvaraju, R.R., Batra, D., Parikh, D., ‘Counting Everyday Objects in Everyday Scenes’, CVPR 2017
Chen, T., Kornblith, S., Norouzi, M., Hinton, G., ‘A Simple Framework for Contrastive Learning of Visual Representations’, ICML 2020
Chen, X., He, K., ‘Exploring Simple Siamese
Representation Learning’, arXiv 2011.10566, 2020
Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.,
‘ImageNet: A Large-Scale Hierarchical Image
Database’, CVPR 2009
Grill, J., Strub, F., Altché, F., Tallec, C., Richemond, P.H., Buchatskaya, E., Doersch, C., Pires, B.A., Guo, Z.D., Azar, M.G., Piot, B., Kavukcuoglu, K., Munos, R., Valko, M., ‘Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning’, arXiv 2006.07733, 2020
He, K., Fan, H., Wu, Y., Xie, S., Girshick, R., ‘Momentum Contrast for Unsupervised Visual Representation Learning’, arXiv 1911.05722, 2019
Kaufman, E.L., Lord, M.W., Reese, T.W., Volkmann, J.,
‘The Discrimination of Visual Number’, The American
Journal of Psychology, 1949, pp. 498-525
Kingma, D.P., Ba, J.L., ‘Adam: A Method for Stochastic Optimization’, ICLR 2015
Martens, J., ‘Deep Learning via Hessian-Free Optimization’, Proceedings of the 27th International Conference on Machine Learning, 2010
Nissani (Nissensohn), D.N., ‘Contrastive Learning and the
Emergence of Attribute Associations’, ICANN 2023
Oord, A.v.d., Li, Y., Vinyals, O., ‘Representation Learning
with Contrastive Predictive Coding’, arXiv
1807.03748, 2018
Simard, P.Y., Steinkraus, D., Platt, J.C., ‘Best Practices for
Convolutional Neural Networks Applied to Visual
Document Analysis’, ICDAR 2003
Tian, Y., Sun, C., Poole, B., Krishnan, D., Schmid, C., Isola, P., ‘What Makes for Good Views for Contrastive Learning?’, NeurIPS 2020
Trick, L.M., Pylyshyn, Z.W., ‘Why are Small and Large Numbers Enumerated Differently? A Limited Capacity Pre-attentive Stage in Vision’, Psychological Review, 1994
Wang, T., Isola, P., ‘Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere’, ICML 2020