A Similarity Detection Platform for Programming Learning

Yuanyuan Li, Yu Sheng, Lei Xiao, Fu Wang

2015

Abstract

Code similarity detection has been studied for several decades, and existing approaches are commonly categorized as attribute-counting or structure-metric. Because attribute-counting is effective only against verbatim copies, mature systems usually use the Greedy String Tiling (GST) string matching algorithm to compare code structure. However, the accuracy of GST is vulnerable to interference. This paper presents a code similarity detection method that combines string matching and sub-graph isomorphism. The similarity is first calculated with the GST algorithm; according to that similarity, the system then determines whether further processing with the sub-graph isomorphism algorithm is required. Experimental results show that our method significantly enhances the efficiency of string matching as well as the accuracy of code similarity detection.
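The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation: the token representation, the normalisation 2 * covered / (len(a) + len(b)), the `min_match` value, and the `low`/`high` thresholds are all assumptions made for illustration, and the sub-graph isomorphism stage itself is left out. The tiling routine is a plain quadratic version of Greedy String Tiling without the Running Karp-Rabin speed-up.

```python
def gst_covered(a, b, min_match=3):
    """Greedy String Tiling: count tokens of a covered by tiles shared with b."""
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    covered = 0
    while True:
        max_len = min_match
        matches = []
        # Find all maximal unmarked matches of the current maximum length.
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_len:
                    matches = [(i, j, k)]
                    max_len = k
                elif k == max_len:
                    matches.append((i, j, k))
        if not matches:
            break
        # Turn non-overlapping matches into tiles by marking their tokens.
        for i, j, k in matches:
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for d in range(k):
                marked_a[i + d] = True
                marked_b[j + d] = True
            covered += k
    return covered


def gst_similarity(a, b, min_match=3):
    """Assumed normalisation: share of both token streams covered by tiles."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    return 2.0 * gst_covered(a, b, min_match) / total


def needs_structural_check(similarity, low=0.4, high=0.9):
    """Hypothetical rule: only ambiguous scores go on to the graph stage."""
    return low <= similarity < high


if __name__ == "__main__":
    original = list("abcdefg")
    suspect = list("xxabcdyyefg")  # same tokens with inserted noise
    score = gst_similarity(original, suspect)
    print(score, needs_structural_check(score))
```

Under this rule, clearly dissimilar and near-identical pairs are settled by string matching alone, so the more expensive structural comparison would run only on the ambiguous band.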



Paper Citation


in Harvard Style

Li Y., Sheng Y., Xiao L. and Wang F. (2015). A Similarity Detection Platform for Programming Learning. In Proceedings of the 7th International Conference on Computer Supported Education - Volume 1: CSEDU, ISBN 978-989-758-107-6, pages 480-485. DOI: 10.5220/0005490304800485


in Bibtex Style

@conference{csedu15,
author={Yuanyuan Li and Yu Sheng and Lei Xiao and Fu Wang},
title={A Similarity Detection Platform for Programming Learning},
booktitle={Proceedings of the 7th International Conference on Computer Supported Education - Volume 1: CSEDU},
year={2015},
pages={480-485},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005490304800485},
isbn={978-989-758-107-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Conference on Computer Supported Education - Volume 1: CSEDU
TI - A Similarity Detection Platform for Programming Learning
SN - 978-989-758-107-6
AU - Li Y.
AU - Sheng Y.
AU - Xiao L.
AU - Wang F.
PY - 2015
SP - 480
EP - 485
DO - 10.5220/0005490304800485