Unsupervised Sentiment Analysis Approach Based On Clustering For Arabic Text

Abstract:

Sentiment analysis is an area of great research interest because of its importance and its advantages in many different domains. Many supervised methods in the existing literature analyze the sentiment of texts, but they usually need manually labeled training data, which takes effort and time to produce. In this paper, we propose a new sentiment analysis approach that uses an unsupervised learning technique to label Arabic text. We illustrate the approach by applying clustering to detect the polarity of text in Arabic multi-domain datasets that include positive and negative reviews. The approach involves several phases: preprocessing, feature extraction, applying K-means algorithms, and dimensionality reduction. To evaluate the clustering, we use external evaluation with measures such as purity, F-measure, and the Fowlkes-Mallows index. All experimental results indicate that this approach is promising for labeling Arabic text, which is highly desired across different domains.
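As an illustration of the external evaluation mentioned above, the following is a minimal sketch of two of the named measures, purity and the Fowlkes-Mallows index, computed from gold polarity labels and cluster assignments. It uses the standard textbook definitions and only the Python standard library. The function names and the toy labels are assumptions made for this example and are not taken from the paper's experiments.

```python
from collections import Counter
from math import comb, sqrt


def purity(true_labels, cluster_ids):
    """Fraction of items that carry the majority gold label of their cluster."""
    per_cluster = {}
    for gold, cluster in zip(true_labels, cluster_ids):
        per_cluster.setdefault(cluster, Counter())[gold] += 1
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(true_labels)


def fowlkes_mallows(true_labels, cluster_ids):
    """FM = TP / sqrt((TP + FP) * (TP + FN)), counted over pairs of items."""
    # Pairs that share both the gold label and the cluster (true positives).
    joint = Counter(zip(true_labels, cluster_ids))
    tp = sum(comb(n, 2) for n in joint.values())
    # Pairs placed in the same cluster (TP + FP).
    same_cluster = sum(comb(n, 2) for n in Counter(cluster_ids).values())
    # Pairs sharing the same gold label (TP + FN).
    same_class = sum(comb(n, 2) for n in Counter(true_labels).values())
    if same_cluster == 0 or same_class == 0:
        return 0.0
    return tp / sqrt(same_cluster * same_class)


# Hypothetical toy data: six reviews, two clusters (illustration only).
gold = ["pos", "pos", "pos", "neg", "neg", "neg"]
pred = [0, 0, 1, 1, 1, 1]

purity_score = purity(gold, pred)
fm_score = fowlkes_mallows(gold, pred)
```

Both measures lie in the range 0 to 1, and a clustering that separates the positive and negative reviews exactly scores 1.0 on each.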
