shows it is able to extract common features for all
tasks and task-specific features in each task,
separately.
6 CONCLUSION
In this work, we proposed Separation Multi-task Networks,
a novel multi-task learning method that simultaneously
extracts features shared between tasks and task-specific
features for each task. The proposed method can be trained
and can perform inference while taking into account both the
features shared between all tasks and the features specific
to each task. Moreover, by introducing channel-wise
convolution, the proposed method can adjust the number of
channels of the feature maps input to each task-specific
layer and fine-tune each task-specific layer. In experiments,
Separation Multi-task Networks performed facial landmark
detection and facial attribute estimation on the CelebA
dataset and outperformed existing methods on both tasks.
We also showed that, in multi-task learning, it is effective
to use the features shared between tasks in a more
supplementary role for each task.
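To make the channel adjustment concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the channel-wise convolution behaves like a 1x1 convolution that mixes channels at every pixel, and that the reduced shared features are concatenated with the task-specific features. Both are assumptions made for illustration. The function `pointwise_conv`, the array names, and all shapes are hypothetical.

```python
import numpy as np


def pointwise_conv(features, weights):
    """Mix channels independently at every pixel (a 1x1 convolution).

    features: array of shape (C_in, H, W)
    weights:  array of shape (C_out, C_in)
    returns:  array of shape (C_out, H, W)
    """
    return np.einsum("oc,chw->ohw", weights, features)


rng = np.random.default_rng(0)

# Hypothetical feature maps: 64 shared channels, 32 task-specific channels.
shared = rng.standard_normal((64, 8, 8))
task_specific = rng.standard_normal((32, 8, 8))

# Assumption: the shared features are reduced to fewer channels so that
# they play a supplementary role next to the task-specific features.
w = rng.standard_normal((16, 64))
shared_reduced = pointwise_conv(shared, w)

# Assumption: the two feature maps are combined by channel concatenation.
task_input = np.concatenate([task_specific, shared_reduced], axis=0)
print(task_input.shape)  # (48, 8, 8)
```

Choosing a smaller output channel count for the shared branch is one way to weight the shared features less heavily than the task-specific ones; the actual ratio is a design choice not specified here.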
Future work includes applying Separation Multi-task
Networks to images other than facial images. In addition,
the proposed method separates the features shared between
tasks from the task-specific features of each task through
two-stage training. Separation Multi-task Networks could
therefore be improved to train end-to-end while still
separating these two kinds of features. The network model
could also be improved: the number of parameters may be
reducible by changing the activation function to CReLU
(Shang et al., 2016).
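For reference, CReLU concatenates ReLU(x) and ReLU(-x) along the channel axis, so its output has twice as many channels as its input. A convolution followed by CReLU can therefore use half as many filters to reach the same output width, which is where a parameter saving could come from. The sketch below only illustrates the activation itself; the function name `crelu` and the toy input are illustrative and not taken from the proposed network.

```python
import numpy as np


def crelu(x, axis=0):
    """Concatenated ReLU: stack ReLU(x) and ReLU(-x) along the channel axis."""
    return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)], axis=axis)


# Toy input with 2 channels and 2 positions per channel.
x = np.array([[-1.0, 2.0],
              [3.0, -4.0]])
y = crelu(x)
print(y.shape)  # (4, 2): the channel count is doubled
```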
REFERENCES
Caruana, R. (1998). Multitask learning. In Learning to
learn, volume 1, pages 95–133. Springer.
Dai, J., He, K., and Sun, J. (2016). Instance-aware semantic
segmentation via multi-task network cascades. In
Conference on Computer Vision and Pattern Recognition,
pages 3150–3158.
Duong, L., Cohn, T., Bird, S., and Cook, P. (2015). Low
resource dependency parsing: Cross-lingual parameter
sharing in a neural network parser. In Proceedings
of the 53rd Annual Meeting of the Association
for Computational Linguistics and the 7th International
Joint Conference on Natural Language Processing,
volume 2, pages 845–850.
Feichtenhofer, C., Pinz, A., and Zisserman, A. (2017). Detect
to track and track to detect. In International Conference
on Computer Vision.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual
learning for image recognition. In Computer
Vision and Pattern Recognition, pages 770–778.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012).
ImageNet classification with deep convolutional neural
networks. In Neural Information Processing Systems,
pages 1097–1105.
Kumar, N., Belhumeur, P. N., and Nayar, S. K. (2008).
FaceTracer: A search engine for large collections of images
with faces.
Liu, P., Qiu, X., and Huang, X. (2017). Adversarial multi-task
learning for text classification. In Proceedings of
the 55th Annual Meeting of the Association for Computational
Linguistics, pages 1–10.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu,
C.-Y., and Berg, A. C. (2016). SSD: Single shot multibox
detector. In European Conference on Computer
Vision.
Liu, Z., Luo, P., Wang, X., and Tang, X. (2015). Deep learning
face attributes in the wild. In International Conference
on Computer Vision.
Lv, J.-J., Shao, X., Xing, J., Cheng, C., Zhou, X., et al.
(2017). A deep regression architecture with two-stage
re-initialization for high performance facial landmark
detection. In Computer Vision and Pattern Recognition.
Misra, I., Shrivastava, A., Gupta, A., and Hebert, M. (2016).
Cross-stitch networks for multi-task learning. In Conference
on Computer Vision and Pattern Recognition.
Ruder, S. (2017). An overview of multi-task learning
in deep neural networks. arXiv preprint
arXiv:1706.05098.
Shang, W., Sohn, K., Almeida, D., and Lee, H. (2016).
Understanding and improving convolutional neural networks
via concatenated rectified linear units. In International
Conference on Machine Learning.
Simonyan, K. and Zisserman, A. (2015). Very deep convolutional
networks for large-scale image recognition.
In International Conference on Learning Representations.
Wei, S.-E., Ramakrishna, V., Kanade, T., and Sheikh, Y.
(2016). Convolutional pose machines. In Conference
on Computer Vision and Pattern Recognition.
Yang, Y. and Hospedales, T. M. (2016). Trace norm regularised
deep multi-task learning. arXiv preprint
arXiv:1606.04038.
Zhang, N., Paluri, M., Ranzato, M., Darrell, T., and Bourdev,
L. (2014a). PANDA: Pose aligned networks for
deep attribute modeling. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition,
pages 1637–1644.
Zhang, Z., Luo, P., Loy, C. C., and Tang, X. (2014b). Facial
landmark detection by deep multi-task learning. In
European Conference on Computer Vision.
Zhao, Y., Tang, F., Dong, W., Huang, F., and Zhang, X.
(2018). Joint face alignment and segmentation via
deep multi-task learning. Multimedia Tools and Applications,
pages 1–18.
VISAPP - 14th International Conference on Computer Vision Theory and Applications