Authors: Kostas Fragos 1 and Christos Skourlas 2

Affiliations: 1 NTUA, Greece ; 2 TEIA, Greece

ISBN: 978-972-8865-50-4

Abstract: In this paper, a novel method for the authorship identification problem is presented. Based on character-level text segmentation, we study the distributions of the disputed text's N-grams within the authors' text collections. The distribution that behaves most abnormally is identified using the Kolmogorov-Smirnov test, and the corresponding author is selected as the correct one. Our method is evaluated using the test sets of the 2004 ALLC/ACH Ad-hoc Authorship Attribution Competition, and its performance is comparable with the best performances of the participants in the competition. The main advantage of our method is that it is a simple, non-parametric approach to authorship attribution that does not require building authors' profiles from training data. Moreover, the method is language independent and does not require segmentation for languages such as Chinese or Thai. There is also no need for any text pre-processing or higher-level processing, thus avoiding the use of taggers, parsers, feature selection strategies, or other language-dependent NLP tools.
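The abstract does not spell out the exact procedure, so the following is only a minimal sketch of the general idea, under stated assumptions. It extracts character N-grams from the disputed text, looks up their relative frequencies in each candidate author's collection, and scores each author by the two-sample Kolmogorov-Smirnov statistic between that author's frequencies and those of all other authors pooled together. The function names (`char_ngrams`, `ks_statistic`, `ngram_sample`, `rank_authors`), the pooling scheme, and the choice of n = 3 are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def ngram_sample(disputed, collection, n):
    """Relative frequency, inside one author's collection, of each distinct disputed n-gram."""
    counts = Counter(char_ngrams(collection, n))
    total = sum(counts.values()) or 1  # avoid division by zero for very short collections
    return [counts[g] / total for g in sorted(set(char_ngrams(disputed, n)))]


def rank_authors(disputed, collections, n=3):
    """Rank authors by how far their sample departs from the pooled samples of the others.

    Assumption (not from the paper): "most abnormal" is read as the largest KS statistic.
    Needs at least two authors; with exactly two the scores tie by symmetry.
    """
    samples = {a: ngram_sample(disputed, t, n) for a, t in collections.items()}
    scores = {}
    for author, sample in samples.items():
        rest = [v for other, s in samples.items() if other != author for v in s]
        scores[author] = ks_statistic(sample, rest)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy collections, invented purely for illustration.
    collections = {
        "A": "the cat sat on the mat",
        "B": "zzzzqqqq xxxx",
        "C": "the dog sat on the log",
    }
    for author, score in rank_authors("the cat sat", collections):
        print(author, round(score, 3))
```

Working on raw characters is what lets a scheme like this skip tokenization, which is in keeping with the abstract's claim that no word segmentation or language-specific tooling is needed. A real evaluation would use full text collections and would need to follow the paper's own definition of the compared distributions.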

CC BY-NC-ND 4.0

Paper citation in several formats:
Fragos K. and Skourlas C. (2006). An N-gram Based Distributional Test for Authorship Identification. In Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006) ISBN 978-972-8865-50-4, pages 139-148. DOI: 10.5220/0002474701390148

@conference{nlucs06,
author={Kostas Fragos and Christos Skourlas},
title={An N-gram Based Distributional Test for Authorship Identification},
booktitle={Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006)},
year={2006},
pages={139-148},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002474701390148},
isbn={978-972-8865-50-4},
}

TY - CONF
JO - Proceedings of the 3rd International Workshop on Natural Language Understanding and Cognitive Science - Volume 1: NLUCS, (ICEIS 2006)
TI - An N-gram Based Distributional Test for Authorship Identification
SN - 978-972-8865-50-4
AU - Fragos, K.
AU - Skourlas, C.
PY - 2006
SP - 139
EP - 148
DO - 10.5220/0002474701390148
