Deep Convolution Neural Network for Extreme Multi-label Text Classification

Francesco Gargiulo, Stefano Silvestri, Mario Ciampi

2018

Abstract

In this paper we present an analysis of the use of Deep Neural Networks for extreme multi-label and multi-class text classification. We consider two network models: the first consists of a word embeddings (WEs) stage followed by two dense layers, hereinafter Dense; the second adds a convolution stage between the WEs and the dense layers, hereinafter CNN-Dense. We take into account classification problems characterized by different numbers of labels, ranging from the order of 10 to the order of 30,000, and show how the performance of the neural networks varies with the total number of labels and the average number of labels per sample, exploiting the hierarchical structure of the label space of the dataset used for the experimental assessment. It is worth noting that multi-label classification is a harder problem than multi-class classification, due to the variable number of labels associated with each sample. We also investigate the behaviour of the neural networks as a function of the training hyperparameters, analysing the link between them and the dataset complexity. All the results are evaluated using the PubMed scientific articles collection as a test case.
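
The difference between the multi-class and multi-label settings mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the output activations or the decision rule, so the softmax/argmax and sigmoid/threshold choices, the function names, the threshold of 0.5 and the example scores are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into a probability distribution over labels."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(score):
    """Turn one raw score into an independent probability."""
    return 1.0 / (1.0 + math.exp(-score))


def predict_multiclass(scores):
    """Multi-class: exactly one label, the one with the highest probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


def predict_multilabel(scores, threshold=0.5):
    """Multi-label: every label whose probability reaches the threshold.

    The result may contain zero, one or many labels, which is the
    variable label count that makes the problem harder.
    The threshold value is an assumption, not taken from the paper.
    """
    return [i for i, s in enumerate(scores) if sigmoid(s) >= threshold]


# Hypothetical raw output scores of a network for four labels.
scores = [2.0, -1.0, 0.5, -3.0]
single_label = predict_multiclass(scores)  # one label index
label_set = predict_multilabel(scores)     # a list of label indices
```

In the multi-class case the output always has size one, whereas in the multi-label case the size of the predicted set has to be inferred as well, for example through a threshold as assumed here.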


Paper Citation


in Harvard Style

Gargiulo F., Silvestri S. and Ciampi M. (2018). Deep Convolution Neural Network for Extreme Multi-label Text Classification. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: AI4Health, ISBN 978-989-758-281-3, pages 641-650. DOI: 10.5220/0006730506410650


in Bibtex Style

@conference{ai4health18,
author={Francesco Gargiulo and Stefano Silvestri and Mario Ciampi},
title={Deep Convolution Neural Network for Extreme Multi-label Text Classification},
booktitle={Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: AI4Health},
year={2018},
pages={641-650},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006730506410650},
isbn={978-989-758-281-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: AI4Health
TI - Deep Convolution Neural Network for Extreme Multi-label Text Classification
SN - 978-989-758-281-3
AU - Gargiulo F.
AU - Silvestri S.
AU - Ciampi M.
PY - 2018
SP - 641
EP - 650
DO - 10.5220/0006730506410650