Authors:
Frederik Timme; Jochen Kerdels and Gabriele Peters
Affiliation:
Chair of Human-Computer Interaction, Faculty of Mathematics and Computer Science, University of Hagen, Universitätsstraße 47, 58097 Hagen, Germany
Keyword(s):
Convolutional Neural Networks, Performance Evaluation, Transformations, Data Augmentation.
Abstract:
Convolutional Neural Networks (CNNs) have become the dominant and arguably most successful approach to image classification since the release of AlexNet in 2012. Despite their excellent performance, CNNs continue to suffer from a still poorly understood lack of robustness when confronted with adversarial attacks or particular forms of handcrafted datasets. Here we investigate how the recognition performance of three widely used CNN architectures (AlexNet, VGG19 and ResNeXt) changes in response to certain input data transformations. 10,000 images from the ILSVRC2012 validation dataset were systematically manipulated by means of common transformations (translation, rotation, color change, background replacement) as well as methods such as image collages and jigsaw-like puzzles. The effects of both single and combined transformations are investigated. Our results show that three of these input image manipulations (rotation, collage, and puzzle) can cause a significant drop in classification accuracy in all evaluated architectures. In general, the more recent VGG19 and ResNeXt displayed higher robustness than AlexNet in our experiments, indicating that some progress has been made in hardening the CNN approach against malicious or unforeseen input.
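To make the jigsaw-like puzzle manipulation concrete, the following is a minimal, dependency-free sketch of one way such a transformation could be written. The abstract does not specify the tile count, the shuffling procedure, or any implementation detail, so the function name `jigsaw_shuffle` and its parameters `grid` and `seed` are hypothetical and chosen only for illustration; the image is assumed to be a plain list of equal-length pixel rows.

```python
import random


def jigsaw_shuffle(image, grid, seed=None):
    """Return a copy of `image` with its grid x grid tiles in random order.

    `image` is a list of equal-length pixel rows. This is an illustrative
    assumption, not the procedure used in the paper.
    """
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image sides must be divisible by grid")
    tile_h, tile_w = height // grid, width // grid

    # Cut the image into tiles, collected in row-major order.
    tiles = [
        [row[c * tile_w:(c + 1) * tile_w]
         for row in image[r * tile_h:(r + 1) * tile_h]]
        for r in range(grid)
        for c in range(grid)
    ]

    # Permute the tiles; a fixed seed makes the permutation reproducible.
    random.Random(seed).shuffle(tiles)

    # Reassemble: each band of `grid` tiles yields `tile_h` output rows.
    shuffled = []
    for r in range(grid):
        band = tiles[r * grid:(r + 1) * grid]
        for y in range(tile_h):
            shuffled.append([pixel for tile in band for pixel in tile[y]])
    return shuffled
```

Calling `jigsaw_shuffle(image, 2, seed=0)` on an image whose sides are even would rearrange its four quadrants. The pixel content of every tile is preserved and only the global spatial arrangement changes, which is the property that makes this kind of manipulation a useful robustness probe.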