Samira Shaikh, Tomek Strzalkowski, Nick Webb


Dialogue act classification is an integral part of many natural language processing applications. In this paper, we apply this task to online multi-party discourse in the Urdu language. We adapt established techniques with language-specific modifications, such as permuting the word order in detected n-grams and varying n-gram location, to develop an approach that is novel for this language. Preliminary results, when compared to a baseline, are very encouraging for this approach.
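The abstract names two modifications without giving details: permuting word order inside detected n-grams, and varying n-gram location. The Python sketch below illustrates one way these ideas could be read; it is not the authors' implementation. The function names `ngrams` and `match_cue`, the location labels (`initial`, `medial`, `final`), and the romanized example utterance are all assumptions introduced here for illustration.

```python
from itertools import permutations

def ngrams(tokens, n):
    """Return (start_index, n-gram) pairs for all contiguous n-grams."""
    return [(i, tuple(tokens[i:i + n])) for i in range(len(tokens) - n + 1)]

def match_cue(tokens, cue, allow_permutation=True):
    """Find a cue n-gram in a tokenized utterance (hypothetical helper).

    Returns (start_index, location_label) for the first match, or None.
    With allow_permutation=True, any reordering of the cue words matches,
    which loosely models tolerance to flexible word order.
    """
    n = len(cue)
    variants = set(permutations(cue)) if allow_permutation else {tuple(cue)}
    for start, gram in ngrams(tokens, n):
        if gram in variants:
            # Assumed three-way location feature; the paper may define
            # n-gram location differently.
            if start == 0:
                location = "initial"
            elif start + n == len(tokens):
                location = "final"
            else:
                location = "medial"
            return start, location
    return None

# Illustrative romanized utterance, not taken from the paper's data.
utterance = "aap ka kya khayal hai".split()
cue = ("khayal", "kya")

print(match_cue(utterance, cue))                           # matches despite reversed order
print(match_cue(utterance, cue, allow_permutation=False))  # exact order only
```

In a classifier along these lines, the matched cue and its location label would presumably serve as features for predicting the dialogue act; how the paper weights or selects such features is not stated in the abstract.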



Paper Citation

in Harvard Style

Shaikh S., Strzalkowski T. and Webb N. (2011). CLASSIFICATION OF DIALOGUE ACTS IN URDU MULTI-PARTY DISCOURSE. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011), ISBN 978-989-8425-79-9, pages 398-404. DOI: 10.5220/0003637304060412
