Research Article

Programs Similarity Measure Based on Tree Structure and Eigenvector

Dongmei Li, Di Zhang, Zhifang Wei and Jianxin Wang

Program similarity measurement detects similarity among programs. It is widely used in teaching and in the protection of intellectual property rights. Most current program similarity measurement techniques suffer from low accuracy. Building on previous studies of program similarity measurement, this study proposes a method based on tree structure and eigenvector. First, the actual frequency of keywords in a program is counted using a hierarchical tree structure. Second, these frequencies are used to generate the eigenvector of the program, improving on the traditional vector-based method. Finally, a program similarity measurement system named CPlag is implemented to measure the similarity of C language programs. Experimental results indicate that CPlag has clear advantages over the well-known JPlag in some respects.
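The basic vector-based idea that the abstract builds on can be illustrated with a minimal sketch. It is not the CPlag implementation: the keyword subset, the function names and the use of cosine similarity are all assumptions made for illustration, and the hierarchical tree structure used in the study to obtain the actual keyword frequencies is not reproduced here because the abstract does not describe it.

```python
import math
import re
from collections import Counter

# Illustrative subset of C keywords (assumption: the study's keyword set is not given here).
C_KEYWORDS = (
    "if", "else", "for", "while", "do", "switch", "case", "return",
    "break", "continue", "int", "char", "float", "double", "void", "struct",
)


def keyword_vector(source):
    """Count how often each keyword occurs and return the counts in a fixed order.

    This is a flat count over identifier-like tokens. Keywords inside comments
    or string literals are counted too, which a real tool would avoid.
    """
    tokens = re.findall(r"[A-Za-z_]\w*", source)
    counts = Counter(t for t in tokens if t in C_KEYWORDS)
    return [counts[k] for k in C_KEYWORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def program_similarity(src_a, src_b):
    """Hypothetical helper: similarity of two C sources from their keyword vectors."""
    return cosine_similarity(keyword_vector(src_a), keyword_vector(src_b))
```

With this sketch, two programs that differ only in variable names get identical keyword vectors, so renaming identifiers alone does not lower the score. A flat count ignores where keywords occur in the program, which is the kind of structural information a tree-based count could add.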


  How to cite this article:

Dongmei Li, Di Zhang, Zhifang Wei and Jianxin Wang, 2013. Programs Similarity Measure Based on Tree Structure and Eigenvector. Journal of Applied Sciences, 13: 2843-2847.

DOI: 10.3923/jas.2013.2843.2847



1:  Aimmanee, P., 2011. Automatic plagiarism detection using word-sentence based S-gram. Chiang Mai J. Sci., 38: 1-7.

2:  Donaldson, J.L., A.M. Lancaster and P.H. Sposato, 1981. A plagiarism detection system. Proceedings of the 12th SIGCSE Symposium on Computer Science Education, February 4-6, 1981, New York, USA., pp: 21-25

3:  Faidhi, J.A.W. and S.K. Robinson, 1987. An empirical approach for detecting program similarity and plagiarism within a university programming environment. Comput. Edu., 11: 11-19.

4:  Grier, S., 1981. A tool that detects plagiarism in Pascal programs. Proceedings of the 12th SIGCSE Symposium on Computer Science Education, February 4-6, 1981, New York, USA., pp: 15-20

5:  Huang, L.L., H.Y. Huang and S.M. Shi, 2010. Method for fingerprint selection orienting to code similarity detection. Comput. Engin. Applic., 46: 169-171.

6:  Inoue, U. and S. Wada, 2012. Detecting plagiarisms in elementary programming courses. Proceedings of the 9th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), May 29-31, 2012, Chongqing University, pp: 2308-2312

7:  Jones, E.L., 2001. Metrics based plagarism monitoring. J. Comput. Sci. Colleges, 16: 253-261.

8:  Kamiya, T., S. Kusumoto and K. Inoue, 2002. CCFinder: A multilinguistic token-based code clone detection system for large scale source code. Trans. Software Eng., 28: 654-670.

9:  Prechelt, L., G. Malpohl and M. Philippsen, 2002. Finding plagiarisms among a set of programs with JPlag. J. Univ. Comput. Sci., 8: 1016-1038.

10:  Schleimer, S., D.S. Wilkerson and A. Aiken, 2003. Winnowing: Local algorithms for document fingerprinting. Proceedings of the ACM SIGMOD International Conference on Management of Data, June 9-12, 2003, San Diego, California, USA., pp: 76-85.

11:  Whale, G., 1988. Plague: Plagiarism detection using program structure. Department of Computer Science Technical Report 8805, University of NSW, Kensington, Australia.

12:  Wise, M.J., 1992. Detection of similarities in student programs: YAP'ing may be preferable to Plague'ing. Proceedings of the 23rd SIGCSE Technical Symposium on Computer Science Education, March 5-6, 1992, Kansas City, Missouri, USA., pp: 268-271

13:  Wise, M.J., 1996. YAP3: Improved detection of similarities in computer program and other texts. Proceedings of the 27th SIGCSE Technical Symposium on Computer Science Education, March 10-12, 1996, New York, NY, USA., pp: 130-134

14:  Xiong, H., H.H. Yan and T. Guo, 2010. Code similarity detection: A survey. Comput. Sci., 37: 9-14.

15:  Zhao, C.H., H.H. Yan and M.Z. Jin, 2008. Approach based on compiling optimization and disassembling to detect program similarity. J. Beijing Univ. Aeronautics Astronautics, 34: 711-715.
