Enhanced CBA Algorithm Based on Apriori Optimization and Statistical Ranking Measure

Abstract:

Association Classification (AC) is a predictive approach that has been widely investigated over the last two decades. Many researchers have applied AC to real-world problems such as text classification, medical diagnosis, fraud detection, and website phishing detection. However, the technique raises two concerns. First, it generates too many rules, which consumes more time and memory than classical data mining techniques. Second, the support and confidence thresholds are set by the user, so the rule-ranking process is always affected by these values. To address these issues, we investigated a modified technique that enhances the performance of the Apriori algorithm used in Classification Based on Association (CBA). The technique uses the harmonic mean measure to generate more confident rules and to improve the ranking of the generated rules. To evaluate the idea, we compared the new algorithm, named Enhanced CBA (ECBA), with three well-known AC algorithms: CBA, Classification based on Multiple Association Rules (CMAR), and Multi-Class Classification based on Association Rule (MCAR). The comparison used ten gold-standard UCI datasets and five common measures: model building time, accuracy, precision, recall, and F-measure. Experimental results showed that ECBA outperforms the other AC techniques in terms of accuracy, precision, recall, and F-measure. They also showed that ECBA improves on the model building time of the original CBA algorithm.
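
To make the ranking idea concrete, the following is a minimal sketch of how class association rules could be ordered by a harmonic mean score. The abstract does not state which quantities the harmonic mean combines; this sketch assumes it combines each rule's support and confidence. The `Rule` class, the `harmonic_mean` and `rank_rules` functions, the tie-breaking on antecedent length, and the sample values are all hypothetical illustrations, not the ECBA implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A class association rule: antecedent items -> class label (hypothetical structure)."""
    antecedent: frozenset
    label: str
    support: float      # fraction of training rows matching antecedent and label
    confidence: float   # fraction of rows matching the antecedent that carry the label


def harmonic_mean(support: float, confidence: float) -> float:
    """Harmonic mean of two non-negative values; defined as 0.0 when both are 0."""
    total = support + confidence
    if total == 0:
        return 0.0
    return 2 * support * confidence / total


def rank_rules(rules):
    """Sort rules by descending harmonic mean; shorter antecedents win ties (an assumption)."""
    return sorted(
        rules,
        key=lambda r: (-harmonic_mean(r.support, r.confidence), len(r.antecedent)),
    )


# Illustrative values only, not taken from the paper's experiments.
rules = [
    Rule(frozenset({"a"}), "yes", support=0.10, confidence=0.95),
    Rule(frozenset({"b"}), "no", support=0.40, confidence=0.70),
    Rule(frozenset({"a", "c"}), "yes", support=0.30, confidence=0.90),
]

ranked = rank_rules(rules)
for r in ranked:
    print(sorted(r.antecedent), r.label, round(harmonic_mean(r.support, r.confidence), 3))
```

Under this assumption, a rule with very high confidence but very low support ranks below rules that balance the two, because the harmonic mean is dominated by the smaller value. A confidence-first ordering would place the first sample rule at the top instead.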
