Investigating the Optimal Number of Attributes to Manage Knowledge Performances

Abstract:

Rules are the most important element in knowledge extraction. The performance, or strength, of the rules determines how good a model is: higher accuracy implies a better model, and vice versa. However, the strength of a rule depends on its attributes, and the number of attributes in a rule can influence the accuracy of a model. Most machine learning techniques produce a large number of rules, and as a consequence processing time is much longer. This study investigated the performance of rules with different numbers of attributes and identified the optimal number of attributes for a good model. Experiments were performed using several data mining techniques on a dataset of 50 hardware companies, containing 31 attributes and 400 records. The results showed that, in terms of the number of rules, Genetic Algorithm produced the highest number of rules, followed by Johnson’s Algorithm and Holte’s 1R. The best classifier for extracting rules in this study was VOT (Voting of Object Tracking). In terms of rule performance, the best results came from rules with 30 attributes, followed by rules with 1 intersection attribute and lastly rules with 3 intersection attributes. Among the three sets of attributes, the set with 3 attributes is considered the best, and three (3) has been identified as the optimal number of attributes.