DTATG: An Automatic Title Generator based on Dependency Trees
Liqun Shao, Jie Wang
2016
Abstract
We study automatic title generation for a given block of text and present a method called DTATG to generate titles. DTATG first extracts a small number of central sentences that convey the main meaning of the text and whose structure is suitable for conversion into a title. DTATG then constructs a dependency tree for each of these sentences and removes certain branches using a Dependency Tree Compression Model that we devise. We also devise a title test to determine whether a sentence can be used as a title. If a trimmed sentence passes the title test, it becomes a title candidate. DTATG selects the title candidate with the highest ranking score as the final title. Our experiments show that DTATG can generate adequate titles and that DTATG-generated titles have higher F1 scores than those generated by previous methods.
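The pipeline described in the abstract (trim the dependency tree of each central sentence, apply a title test, rank the surviving candidates) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the `Token` structure, the `REMOVABLE` set of relation labels, the length-based title test, and the keyword-overlap score are hypothetical stand-ins for the Dependency Tree Compression Model, the title test, and the ranking score. Central-sentence extraction and parsing are assumed to have happened already, so the input is a hand-annotated parse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    word: str
    head: int    # idx of the governing token, 0 for the root
    deprel: str  # dependency relation label


# Hypothetical set of relations whose subtrees are treated as removable.
REMOVABLE = {"advmod", "amod", "obl", "punct"}


def trim(tokens, removable=REMOVABLE):
    """Drop each token whose own relation, or an ancestor's, is removable."""
    by_idx = {t.idx: t for t in tokens}

    def is_dropped(tok):
        # Walk up towards the root, checking every relation on the way.
        while tok.head != 0:
            if tok.deprel in removable:
                return True
            tok = by_idx[tok.head]
        return False  # reached the root, which is always kept

    return [t for t in tokens if not is_dropped(t)]


def passes_title_test(tokens, max_words=8):
    """Hypothetical title test: short enough and still contains the root."""
    return 0 < len(tokens) <= max_words and any(t.head == 0 for t in tokens)


def score(tokens, keywords):
    """Hypothetical ranking score: fraction of kept words that are keywords."""
    words = [t.word.lower() for t in tokens]
    return sum(w in keywords for w in words) / len(words)


def generate_title(parsed_sentences, keywords):
    """Trim each sentence, keep those passing the test, return the best one."""
    candidates = [trim(s) for s in parsed_sentences]
    candidates = [c for c in candidates if passes_title_test(c)]
    if not candidates:
        return None
    best = max(candidates, key=lambda c: score(c, keywords))
    return " ".join(t.word for t in best)


# Hand-annotated parse of "DTATG quickly generates adequate titles from text ."
example = [
    Token(1, "DTATG", 3, "nsubj"),
    Token(2, "quickly", 3, "advmod"),
    Token(3, "generates", 0, "root"),
    Token(4, "adequate", 5, "amod"),
    Token(5, "titles", 3, "obj"),
    Token(6, "from", 7, "case"),
    Token(7, "text", 3, "obl"),
    Token(8, ".", 3, "punct"),
]

print(generate_title([example], {"dtatg", "titles"}))
```

With the assumed `REMOVABLE` set, the adverb, the adjective, the oblique phrase "from text", and the punctuation are pruned, leaving the subject, verb, and object as the single title candidate.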
Paper Citation
in Harvard Style
Shao L. and Wang J. (2016). DTATG: An Automatic Title Generator based on Dependency Trees. In Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016) ISBN 978-989-758-203-5, pages 166-173. DOI: 10.5220/0006035101660173
in Bibtex Style
@conference{kdir16,
author={Liqun Shao and Jie Wang},
title={DTATG: An Automatic Title Generator based on Dependency Trees},
booktitle={Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)},
year={2016},
pages={166-173},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006035101660173},
isbn={978-989-758-203-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2016)
TI - DTATG: An Automatic Title Generator based on Dependency Trees
SN - 978-989-758-203-5
AU - Shao L.
AU - Wang J.
PY - 2016
SP - 166
EP - 173
DO - 10.5220/0006035101660173