External Validity. The random selection of the texts to be summarized reduces the threat of interaction of selection and treatment (that is, of drawing a non-representative population sample). Limited computational power did not allow us to use the optimal algorithm for each topic; instead, a set of algorithms was run and their results compared, which reduces the threat of interaction of setting and treatment. Finally, the only threat of interaction of history and treatment can come from new and more powerful TS methods.
6 CONCLUSIONS
The main goal of this paper was to question the validity of the ROUGE evaluation metric for TS algorithms and, subsequently, to understand whether a single execution of an algorithm leads to better results than multiple executions. From our experiments, we concluded that ROUGE is not an effective metric, and that multiple executions lead to better results than a single one (even when evaluated by ROUGE). In summary, a good ROUGE score is not synonymous with good summary quality once readability and syntactic correctness are also taken into account.
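To make this limitation concrete, the sketch below (an illustrative example, not the evaluation code used in our experiments) computes ROUGE-1 recall under the simplifying assumption of whitespace tokenization: a scrambled, syntactically broken permutation of the reference receives exactly the same score as the fluent original, because unigram overlap is blind to word order.

from collections import Counter

def rouge_1_recall(reference: str, candidate: str) -> float:
    # ROUGE-1 recall: fraction of reference unigrams covered by the candidate.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())

reference = "the cat sat quietly on the mat"
fluent = "the cat sat quietly on the mat"
scrambled = "mat the quietly on sat cat the"  # same words, unreadable order

print(rouge_1_recall(reference, fluent))     # 1.0
print(rouge_1_recall(reference, scrambled))  # 1.0, despite broken syntax

Any metric based purely on n-gram statistics inherits this blindness, which is why a statistical score alone cannot capture readability.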
As future work, the analysis could be extended to other, less well-known algorithms. The goal would be to discover new approaches that evaluate summary quality directly, avoiding purely statistical measurements; one idea is the use of NLP algorithms for text comprehension. Another scenario is the evaluation of summaries on a specific topic, training the different algorithms with data from a narrow field of interest in order to obtain more accurate results.