Programs Similarity Measure Based on Tree Structure and Eigenvector
Abstract:
Program similarity measurement aims to detect how similar programs are
to one another. It is widely used in teaching and in the protection
of intellectual property rights. Most current program similarity measurement techniques
suffer from low accuracy. Building on previous studies of program similarity
measurement, this study proposes a method based on tree structure and eigenvectors.
First, the actual frequency of keywords in the program is counted using
a hierarchical tree structure. Second, these frequencies are used to
generate an eigenvector for the program, improving on the traditional
vector-based method. Finally, a program similarity measurement system named CPlag is
implemented to measure the similarity of C language programs. Experimental results
indicate that CPlag has clear advantages over the well-known JPlag in some
aspects.
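The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea, not the CPlag implementation. It assumes that the "eigenvector" is a feature vector of keyword counts, that brace nesting depth can stand in for the level in the hierarchical tree, and that vectors are compared by cosine similarity. The keyword subset and the function names `keyword_depth_counts`, `cosine_similarity` and `program_similarity` are hypothetical, and comments and string literals in the C source are not stripped.

```python
import math
import re
from collections import Counter

# Illustrative subset of C keywords (assumption: the paper's actual
# keyword set is not stated in the abstract).
KEYWORDS = ("if", "else", "for", "while", "do", "switch",
            "case", "return", "break", "continue")

# Matches identifiers/keywords and the two brace characters.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|[{}]")


def keyword_depth_counts(source: str) -> Counter:
    """Count (keyword, brace depth) pairs in C source text.

    Brace depth is a crude stand-in (assumption) for the level of a
    node in a hierarchical tree of the program.
    """
    counts = Counter()
    depth = 0
    for token in TOKEN_RE.findall(source):
        if token == "{":
            depth += 1
        elif token == "}":
            depth = max(depth - 1, 0)
        elif token in KEYWORDS:
            counts[(token, depth)] += 1
    return counts


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def program_similarity(src_a: str, src_b: str) -> float:
    """Similarity of two C programs from their keyword/depth feature vectors."""
    return cosine_similarity(keyword_depth_counts(src_a),
                             keyword_depth_counts(src_b))
```

Because the vector depends only on keywords and their nesting depth, renaming variables leaves the score unchanged, while two programs with different control structure score lower. A real system would also need a proper C tokenizer or parser instead of the regular expression used here.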
How to cite this article
Dongmei Li, Di Zhang, Zhifang Wei and Jianxin Wang, 2013. Programs Similarity Measure Based on Tree Structure and Eigenvector. Journal of Applied Sciences, 13: 2843-2847.