Arabic Text Classification Using Improved KNN Algorithm Based on K-means Clustering

Abstract:

With the rapid growth of the Internet, the amount of text information in web pages has increased enormously. This growth calls for smart techniques that organize and filter the unlabeled information retrieved by search engines and make it useful for the end user. Text classification is one of the important techniques used to assign documents to predefined categories based on their content. In this paper we propose a new Arabic text classifier based on an improved KNN classifier. We suggest that both the accuracy and the speed of the KNN classifier can be improved by adding a preprocessing step that applies a clustering process using the K-means algorithm.
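The abstract does not spell out how the clustering step feeds into KNN, so the following is only a minimal sketch of one plausible reading: each category's training documents are clustered with K-means, and KNN then searches the resulting centroids instead of every training document. The function names, the per-class clustering, the Euclidean distance, and the two-dimensional toy vectors (standing in for real document feature vectors) are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter, defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's K-means, seeded with the first k vectors (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in vectors:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean(v, centroids[i]))
            groups[nearest].append(v)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def reduce_training_set(vectors, labels, clusters_per_class):
    """Assumed preprocessing step: replace each class's documents
    with the K-means centroids of that class."""
    by_class = defaultdict(list)
    for v, label in zip(vectors, labels):
        by_class[label].append(v)
    reduced_vectors, reduced_labels = [], []
    for label, members in by_class.items():
        k = min(clusters_per_class, len(members))
        for centroid in kmeans(members, k):
            reduced_vectors.append(centroid)
            reduced_labels.append(label)
    return reduced_vectors, reduced_labels


def knn_predict(train_vectors, train_labels, query, k):
    """Majority vote among the k training vectors nearest to the query."""
    ranked = sorted(zip(train_vectors, train_labels),
                    key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy vectors standing in for document features (hypothetical data).
docs = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0], [0.7, 0.1],
        [0.1, 0.9], [0.2, 0.8], [0.0, 1.0], [0.1, 0.7]]
labels = ["sports"] * 4 + ["economy"] * 4

reduced_vectors, reduced_labels = reduce_training_set(docs, labels, clusters_per_class=2)
prediction = knn_predict(reduced_vectors, reduced_labels, [0.85, 0.15], k=1)
```

In this sketch KNN compares a query against 4 centroids rather than 8 documents, which illustrates where a speed gain could come from. Whether accuracy also improves depends on the data and on how many clusters are kept per category.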