Figure 3: Training and testing accuracy of the 3-layer CNN.
accuracy and gave us more confidence in using sampled instances for the Siamese network as well.
Figure 4 shows the testing loss of Siamese networks fine-tuned with and without importance sampling, on the regular dataset and on a dataset in which 50% of the samples have similar labels.
Figure 4: Testing loss on Siamese networks.
The results demonstrate the efficacy of importance sampling in improving the accuracy of the trained model.
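As an illustration of the fine-tuning loop, the sketch below applies loss-proportional importance sampling to a compiled Keras Siamese model that maps two image batches to one distance per pair. This is a minimal sketch, not our exact implementation: the model object, the contrastive-loss margin, and the hyperparameters shown are assumed for illustration, and the pair arrays are NumPy arrays.

import numpy as np

def contrastive_loss(d, y, margin=1.0):
    # y = 1 for similar pairs (pulled together), y = 0 for dissimilar
    # pairs (pushed apart until the margin is met).
    return y * d ** 2 + (1.0 - y) * np.maximum(margin - d, 0.0) ** 2

def importance_sampling_finetune(model, x_a, x_b, y,
                                 steps=200, batch_size=128):
    n = len(y)
    for _ in range(steps):
        # Score every pair with its current loss (one forward pass).
        d = model.predict([x_a, x_b], verbose=0).ravel()
        loss = contrastive_loss(d, y)
        # Sample a mini-batch proportionally to the loss; the small
        # epsilon keeps zero-loss pairs selectable.
        p = (loss + 1e-8) / (loss + 1e-8).sum()
        idx = np.random.choice(n, size=batch_size, p=p)
        # Weight each sampled pair by 1/(n * p_i) so the stochastic
        # gradient remains an unbiased estimate of the full gradient.
        w = 1.0 / (n * p[idx])
        model.train_on_batch([x_a[idx], x_b[idx]], y[idx],
                             sample_weight=w)

Re-scoring the whole pool at every step is the main overhead of this scheme; in practice the per-pair losses can be refreshed only periodically to amortize the extra forward passes.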
8 CONCLUSION
We have presented a practical method to enhance the training and accuracy of Siamese networks. We demonstrated that importance sampling, a variance-reduction method, can improve both the training and testing accuracy of a Siamese network. To our knowledge, this is the first attempt to combine importance sampling with Siamese networks. Unlike a regular CNN classifier, whose output layer is tied to a fixed set of classes, a Siamese network learns a similarity metric and can therefore scale to recognizing images across hundreds of classes or subjects. We have empirically demonstrated the validity of using importance sampling to fine-tune the training. Future work will involve further optimization of importance sampling for training Siamese and other types of networks.
[Figure 3 chart data, "3 layer CNN": No Sampling 0.9848 training / 0.976 test accuracy; Importance Sampling 0.9969 training / 0.9895 test accuracy.]
[Figure 4 chart data, "Siamese Network - Testing": No Sampling 0.12 testing loss (regular dataset) / 0.31 (50% similar labels); Importance Sampling 0.11 / 0.29.]