far. However, the connection set should be updated when the environment changes. The distillation strategy employed in Foster (Wang et al., 2022) is also useful for reducing and re-adjusting the subnetworks: the distillation operation gives each subnetwork and its corresponding aggregation layer a clear opportunity to learn class boundaries.
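As a rough illustration, the sketch below shows a standard soft-target distillation loss (Hinton et al., 2015), assuming PyTorch. How the teacher and student logits would map onto the existing subnetworks and aggregation layer is an assumption made here for illustration, not the exact procedure of this paper or of Foster.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target distillation loss (Hinton et al., 2015).

    Assumed pairing (not taken from the paper): `teacher_logits` come from
    the existing subnetworks and aggregation layer, while `student_logits`
    come from the reduced / re-adjusted subnetwork being trained.
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # KL divergence between the softened distributions, scaled by T^2 as usual.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

Minimizing this loss while re-adjusting a subnetwork would transfer the class boundaries encoded by the teacher outputs into the smaller student, which is the opportunity referred to above.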
8 CONCLUSION
In this paper, a hidden-neural-network-based incremental learning method, the Swap deep neural network (Swap-NN), is proposed. This model has two advantages:
1. The hidden neural network can reuse its neurons across several different tasks by reconfiguring the neural network circuit. This architecture is therefore suitable for small embedded systems, where storage capacity is limited (a minimal sketch of this per-task reconfiguration is given after this list).
2. A subnetwork is not modified after its creation, so the system does not suffer from catastrophic forgetting.
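For illustration only, the following sketch (assuming PyTorch; the class name MaskedLinear and the method register_task are hypothetical, not part of the paper) shows the kind of per-task reconfiguration referred to in item 1: a single frozen, randomly initialized weight tensor is shared by all tasks, and each task stores only a binary connection set that selects its subnetwork.

import torch

class MaskedLinear(torch.nn.Module):
    """Shared frozen random weights reused across tasks via binary masks.

    Hypothetical illustration of the per-task "connection set" idea; the
    names used here are not taken from the paper.
    """

    def __init__(self, in_features, out_features):
        super().__init__()
        # Random weights are generated once and never trained (frozen).
        self.weight = torch.nn.Parameter(
            torch.randn(out_features, in_features), requires_grad=False)
        self.masks = {}  # task id -> binary mask (the connection set)

    def register_task(self, task_id, mask):
        # Storing a 1-bit mask per task is far cheaper than a full copy
        # of the weights, which suits storage-limited embedded systems.
        self.masks[task_id] = mask.to(self.weight.dtype)

    def forward(self, x, task_id):
        # Reconfigure the circuit by applying the task's connection set;
        # the shared weights themselves are never modified.
        return torch.nn.functional.linear(x, self.weight * self.masks[task_id])

Because the shared weights and the stored masks are never changed once a task is registered, later tasks cannot overwrite what earlier tasks rely on, which is the property item 2 refers to.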
The simulation results suggest that Swap-NN enables effective execution of large-scale neural networks with a small amount of resources.
REFERENCES
Aljundi, R., Babiloni, F., Elhoseiny, M., Rohrbach, M.,
and Tuytelaars, T. (2018). Memory aware synapses:
Learning what (not) to forget. In Proceedings of
the European Conference on Computer Vision (ECCV
2018).
French, R. M. (1997). Pseudo-recurrent connectionist
networks: An approach to the “sensitivity stability”
dilemma. Connection Science, 9(4):353–379.
French, R. M. (1999). Catastrophic forgetting in connec-
tionist networks. TRENDS in Cognitive Sciences,
3(4):128–135.
Freund, Y. and Schapire, R. E. (1997). A decision-theoretic
generalization of on-line learning and an application
to boosting. Journal of Computer and System Sci-
ences, 55(1):119–139.
Fukushima, K. (1975). Cognitron: A self-organizing mul-
tilayered neural network. Biological Cybernetics,
20(3):121–136.
Hayes, T. L., Kafle, K., Shrestha, R., Acharya, M., and
Kanan, C. (2019). Remind your neural network to
prevent catastrophic forgetting. CoRR.
He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving
deep into rectifiers: Surpassing human-level perfor-
mance on imagenet classification. In 2015 IEEE In-
ternational Conference on Computer Vision (ICCV),
pages 1026–1034.
Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the
knowledge in a neural network. arXiv.org.
Hirose, K., Yu, J., Ando, K., Okoshi, Y., García-Arias, Á. L., Suzuki, J., Chu, T. V., Kawamura, K., and Motomura, M. (2022). Hiddenite: 4K-PE hidden network inference 4D-tensor engine exploiting on-chip model construction achieving 34.8-to-16.0 TOPS/W for CIFAR-100 and ImageNet. In 2022 IEEE International Solid-State Circuits Conference (ISSCC), volume 65, pages 1–3.
Hsu, Y.-C., Liu, Y.-C., Ramasamy, A., and Kira, Z. (2018).
Re-evaluating continual learning scenarios: A cate-
gorization and case for strong baselines. In Ben-
gio, S., Wallach, H., Larochelle, H., Grauman, K.,
Cesa-Bianchi, N., and Garnett, R., editors, Advances
in Neural Information Processing Systems 31. Curran
Associates, Inc.
Ishikawa, M. (1996). Structural learning with forgetting.
Neural Networks, 9(3):509–521.
Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J., Desjardins, G., Rusu, A. A., Milan, K., Quan, J., and Ramalho, T. (2017). Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences of the United States of America, 114(13):3521–3526.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012).
Imagenet classification with deep convolutional neu-
ral networks. In Pereira, F., Burges, C. J. C., Bottou,
L., and Weinberger, K. Q., editors, Advances in Neu-
ral Information Processing Systems 25, pages 1097–
1105. Curran Associates, Inc.
Li, Z. and Hoiem, D. (2016). Learning without forgetting. In European Conference on Computer Vision 2016, pages 614–629. SCITEPRESS – Science and Technology Publications, Lda.
Luo, L., Xiong, Y., Liu, Y., and Sun, X. (2019). Adaptive gradient methods with dynamic bound of learning rate. In International Conference on Learning Representations (ICLR 2019).
Mallya, A. and Lazebnik, S. (2018). PackNet: Adding multiple tasks to a single network by iterative pruning. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7765–7773.
Ramanujan, V., Wortsman, M., Kembhavi, A., Farhadi, A.,
and Rastegari, M. (2020). What’s hidden in a ran-
domly weighted neural network? arXiv.org.
Rusu, A. A., Rabinowitz, N. C., Desjardins, G., Soyer,
H., Kirkpatrick, J., Kavukcuoglu, K., Pascanu, R.,
and Hadsell, R. (2016). Progressive neural networks.
CoRR.
Sadel, J., Kawulok, M., Przeliorz, M., Nalepa, J., and
Kostrzewa, D. (2023). Genetic structural nas: A neu-
ral network architecture search with flexible slot con-
nections. In GECCO’23 Companion: Proceedings of
the Companion Conference on Genetic and Evolution-
ary Computation, pages 79–80. Association for Com-
puting Machinery.
Wang, F.-Y., Zhou, D.-W., Ye, H.-J., and Zhan, D.-C. (2022). Foster: Feature boosting and compression for class-incremental learning. In ECCV 2022: 17th European Conference on Computer Vision.