Journal of Applied Sciences
  Year: 2014 | Volume: 14 | Issue: 2 | Page No.: 171-176
DOI: 10.3923/jas.2014.171.176
 

How the Parameters of K-nearest Neighbor Algorithm Impact on the Best Classification Accuracy: In Case of Parkinson Dataset

Chih-Min Ma, Wei-Shui Yang and Bor-Wen Cheng

Abstract:
Parkinson’s disease (PD) is a chronic, progressive neurological disease that affects the part of the nervous system in the brain that controls muscle movement. The disease is complex and can be difficult to diagnose accurately in its early stages, which has prompted researchers to try various classification methods to separate healthy subjects from PD subjects. The K-Nearest Neighbor (KNN) classifier is one of the most heavily used classifiers and a common benchmark in classification. This study explores two specific issues in the KNN approach. The first is to determine the optimal value of k; the second is to identify the effects of the distance metric and of normalization on the KNN classifier with the Parkinson dataset. The study uses a series of data splits in which the percentage of training data gradually increases from 5% up to Leave-One-Out cross-validation (LOOCV). The results show that classification accuracy on the raw PD dataset is always lower than on the normalized dataset, whether Euclidean or Manhattan distance is used. The value of k that gives the best classification accuracy also differs across the ratios of training to testing data, which means that the best k should not be chosen arbitrarily; it should be obtained by careful calculation. When the training set accounts for 95% of the data, min-max normalization with Euclidean distance achieves promising performance, 96.73±5.97% (k = 1). The LOOCV accuracy, 96.41% (k = 1), is very close to this optimum and shows no variation. Although the LOOCV result is not the best, LOOCV is more popular when a stringent estimate is required.
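The ingredients named in the abstract (min-max normalization, Euclidean and Manhattan distance, a search over k, and LOOCV) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the eight-sample toy dataset, and the tie-breaking rule are assumptions made for illustration, and the toy data is not the Parkinson dataset. For brevity the sketch normalizes the whole dataset once before cross-validation, a simplification that lets the held-out sample influence the scaling.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Rescale each feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_x, train_y, query, k, dist):
    """Majority vote among the k nearest training points.

    Ties go to the label seen first among the neighbors (an assumption).
    """
    nearest = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loocv_accuracy(xs, ys, k, dist):
    """Leave-one-out: each sample is classified by all the others."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(train_x, train_y, xs[i], k, dist) == ys[i]
    return hits / len(xs)


# Hypothetical toy data: feature 0 separates the classes but has a small
# range; feature 1 is uninformative but has a large range.
raw = [
    [0.10, 100.0], [0.20, 900.0], [0.15, 500.0], [0.25, 300.0],  # class 0
    [0.80, 120.0], [0.90, 880.0], [0.85, 520.0], [0.75, 310.0],  # class 1
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scaled = min_max_normalize(raw)

for name, data in (("raw", raw), ("min-max", scaled)):
    for dist in (euclidean, manhattan):
        for k in (1, 3, 5):
            acc = loocv_accuracy(data, labels, k, dist)
            print(f"{name:8s} {dist.__name__:10s} k={k} accuracy={acc:.2f}")
```

On this toy data the large-range feature dominates the raw distances, so the normalized features classify better at k = 1, which mirrors the direction of the effect the abstract reports. The numbers themselves say nothing about the Parkinson dataset.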
How to cite this article:

Chih-Min Ma, Wei-Shui Yang and Bor-Wen Cheng, 2014. How the Parameters of K-nearest Neighbor Algorithm Impact on the Best Classification Accuracy: In Case of Parkinson Dataset. Journal of Applied Sciences, 14: 171-176.

DOI: 10.3923/jas.2014.171.176

URL: https://scialert.net/abstract/?doi=jas.2014.171.176
