Authors: ASHA PANDIAN, JEBARAJAN THAVEETHU
Abstract: Association rule mining seeks to identify useful relations among the items present in a database. The basic Apriori algorithm requires multiple database scans, so for a large database the time spent on scanning and candidate generation is also large. The proposed algorithm attempts to avoid repeated scanning of the whole database. It reduces both the scanning time and the generation of subitems that are not frequent. The former is achieved by sorting the transaction records in descending order of the size of transaction (SOT) and scanning only those transactions whose SOT is greater than or equal to k (the size of the item sets being counted). The latter is achieved by analyzing the item set state: item sets that are not frequent need not be used to generate the next set of candidate item sets. Both positive and negative rule mining were carried out in RStudio using the R language. Experimental results show that the SOT algorithm performs better than the Apriori, Eclat, PVARM (partition-based validation for association rule mining), and NRRM (nonredundant rule method) algorithms. The work was tested on several standard datasets: Adult, Genome, Groceries, and SER (State Electricity Rate) Prediction. The speed-up and efficiency values obtained suggest that the proposed SOTARM algorithm outperforms the other algorithms it was compared with.
Keywords: Association rules, Apriori, item set state, positive rules, size of transaction, confidence
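The two ideas summarized in the abstract (scanning only transactions whose SOT is at least k, and building candidates only from frequent item sets) can be illustrated with a minimal sketch. This is not the authors' R implementation: it is written in Python for brevity, the function name `sot_frequent_itemsets` and the parameter `min_support_count` are hypothetical, and treating support as an absolute transaction count is an assumption.

```python
from itertools import combinations


def sot_frequent_itemsets(transactions, min_support_count):
    """Apriori-style mining with size-of-transaction (SOT) pruning.

    Illustrative sketch only; names and the absolute-count support
    threshold are assumptions, not taken from the paper.
    """
    # Sort once in descending order of transaction size (SOT).
    records = sorted((frozenset(t) for t in transactions), key=len, reverse=True)

    # Pass for k = 1: every non-empty transaction qualifies.
    counts = {}
    for record in records:
        for item in record:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support_count}

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1

        # Build size-k candidates only from frequent (k-1)-item sets;
        # item sets that are not frequent never seed a candidate.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count support, stopping at the first transaction with SOT < k.
        # Because records are sorted, all later ones are too short as well.
        counts = dict.fromkeys(candidates, 0)
        for record in records:
            if len(record) < k:
                break
            for candidate in candidates:
                if candidate <= record:
                    counts[candidate] += 1
        current = {s: c for s, c in counts.items() if c >= min_support_count}

    return frequent


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "diapers", "beer", "eggs"],
        ["milk", "diapers", "beer", "cola"],
        ["bread", "milk", "diapers", "beer"],
        ["bread", "milk", "diapers", "cola"],
    ]
    for itemset, support in sot_frequent_itemsets(baskets, 3).items():
        print(sorted(itemset), support)
```

Sorting once up front means each later pass can stop at the first transaction that is too short, rather than testing the size of every record. The saving grows with k when the dataset contains many short transactions.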