Automated Arabic Text Categorization Using SVM and KNN

Abstract:

Text classification is a supervised learning technique that uses labeled training data to derive a classification system (classifier) and then uses the derived classifier to automatically classify unlabeled text data. In this paper, we investigate the K-Nearest Neighbor (KNN) method and the Support Vector Machine (SVM) algorithm on different Arabic data sets. We base our comparison on the most popular text evaluation measures. Experimental results on different Arabic text categorization data sets reveal that the SVM algorithm outperforms KNN with regard to all measures.
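
As an illustration of the supervised workflow described above (derive a classifier from labeled text, then label unseen text), the following is a minimal KNN sketch in Python. It is not the implementation evaluated in this paper: the whitespace tokenization, term-frequency vectors, cosine similarity, the value k = 3, the function names, and the toy corpus are all assumptions made for illustration. SVM is omitted for brevity.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn whitespace-separated tokens into a term-frequency vector."""
    return Counter(text.split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(train, text, k=3):
    """Label `text` by majority vote among its k most similar training docs."""
    query = vectorize(text)
    ranked = sorted(
        train,
        key=lambda pair: cosine(vectorize(pair[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus (illustrative only, not one of the paper's data sets).
TRAIN = [
    ("كرة القدم مباراة فريق", "sports"),
    ("فريق لاعب هدف مباراة", "sports"),
    ("بطولة لاعب كرة", "sports"),
    ("اقتصاد سوق أسهم بنك", "economy"),
    ("بنك استثمار سوق", "economy"),
    ("أسهم اقتصاد استثمار", "economy"),
]

if __name__ == "__main__":
    print(knn_classify(TRAIN, "مباراة كرة فريق"))
    print(knn_classify(TRAIN, "سوق بنك أسهم"))
```

A realistic pipeline for Arabic would likely add normalization, stop-word removal, and stemming before vectorization, and would weight terms (for example with TF-IDF) rather than use raw counts; those steps are left out here to keep the sketch short.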